Mutations in the katG gene useful for detection of M. tuberculosis

ABSTRACT

A method for selectively detecting M. tuberculosis is provided employing restriction fragment length polymorphism analysis of an enzymatic digest of the M. tuberculosis katG gene.

This is a continuation-in-part of U.S. patent application Ser. No. 08/418,782, filed Apr. 7, 1995, now U.S. Pat. No. 5,658,733, issued Aug. 19, 1997, which is a continuation-in-part of U.S. patent application Ser. No. 08/228,662, filed Apr. 18, 1994, now U.S. Pat. No. 5,688,639, issued Nov. 18, 1997, both of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

Despite more than a century of research since the discovery of Mycobacterium tuberculosis, the aetiological agent of tuberculosis, this disease remains one of the major causes of human morbidity and mortality. There are an estimated 3 million deaths annually attributable to tuberculosis (see, D. Snider, Rev. Inf. Dis., S335 (1989)), and although the majority of these are in developing countries, the disease is assuming renewed importance in the West due to the increasing number of homeless people and the impact of the AIDS epidemic (see, R. E. Chaisson et al., Am. Rev. Resp. Dis., 23, 56 (1987); D. E. Snider, Jr. et al., New Engl. J. Med., 326, 703 (1992); M. A. Fischl et al., Ann. Int. Med., 117, 177 (1992) and ibid. at 184).

Isonicotinic acid hydrazide or isoniazid (INH) has been used in the treatment of tuberculosis for the last forty years due to its exquisite potency against the members of the "tuberculosis" group: Mycobacterium tuberculosis, M. bovis and M. africanum (G. Middlebrook, Am. Rev. Tuberc., 69, 471 (1952) and J. Youatt, Am. Rev. Resp. Dis., 99, 729 (1969)). Neither the precise target of the drug nor its mode of action is known, but INH treatment results in the perturbation of several metabolic pathways of the bacterium. However, shortly after its introduction, INH-resistant isolates of Mycobacterium tuberculosis emerged. See M. L. Pearson et al., Ann. Int. Med., 117, 191 (1992) and S. W. Dooley et al., Ann. Int. Med., 117, 257 (1992).

Several investigators have associated the toxicity of INH for mycobacteria with endogenous catalase activity. See, for example, "Isonicotinic acid hydrazide," in F. E. Hahn, Mechanism of Action of Antibacterial Agents, Springer-Verlag (1979) at pages 98-119. This relationship was strengthened by a recent report by Ying Zhang and colleagues in Nature, 358, 591 (1992) which described the restoration of INH susceptibility in an INH resistant Mycobacterium smegmatis strain after transformation using the catalase-peroxidase (katG) gene from an INH sensitive M. tuberculosis strain. In a follow-up study, Zhang and colleagues in Molec. Microbiol., 8, 521 (1993) demonstrated the restoration of INH susceptibility in INH resistant M. tuberculosis strains after transformation by the functional katG gene. As reported by B. Heym et al., J. Bacteriol., 175, 4255 (1993), the katG gene encodes an 80,000-dalton protein.

A conclusive diagnosis of tuberculosis depends on the isolation and identification of the etiologic agent, Mycobacterium tuberculosis, which generally requires 3-8 weeks. Design of an appropriate therapeutic regimen depends on the results of subsequent antituberculosis susceptibility testing by the agar dilution method and produces additional delays of 3-6 weeks (Roberts et al., "Mycobacterium" in Manual of Clinical Microbiology, 5th Ed.; A. Balows et al., Eds.; American Society for Microbiology: Washington; pp. 304-399 (1991)). Identification and drug resistance testing can now also be accomplished more quickly by using the BACTEC radiometric method (Tenover et al., J. Clin. Microbiol., 31, 767-779 (1993) and Huebner et al., J. Clin. Microbiol., 31, 771-775 (1993)). Acid fast bacilli are detected in the BACTEC bottle, and an identification is made using a nucleic acid hybridization technique on the BACTEC-derived growth. Drug susceptibility testing is then conducted using the same BACTEC growth to inoculate fresh BACTEC bottles containing various antituberculosis drugs. This procedure reduces the time needed to generate a complete analysis, but the total time required to report susceptibility results for MTB is typically in excess of 20 days. The need to minimize the transmission of newly identified drug resistant strains of MTB requires the development of much more rapid identification procedures.

The rapid detection of M. tuberculosis directly from clinical samples has been possible recently by virtue of the availability of polymerase chain reaction (PCR) and the recognition of diagnostic sequences amplified by the appropriate primers. The ability to conduct PCR analyses depends on having a high enough gene or gene product concentration so that the molecular tools work efficiently even when the organism numbers are low. Thus, the most efficient molecular assays used to detect M. tuberculosis depend on the IS6110 insertion sequence (about 10 copies) or the 16S ribosomal RNA (thousands of copies). See, respectively, K. D. Eisenach et al., J. Infect. Dis., 61, 997 (1990) and N. Miller et al., Abstracts ASM, Atlanta, Ga. (1993) at page 177. However, these methods do not provide any information regarding the drug-resistance phenotype of the M. tuberculosis strain.

Recently, B. Heym et al. (PCT WO 93/22454) disclosed the use of polymerase chain reaction to amplify portions of the katG gene of putative resistant strains. The PCR products were evaluated by single-strand conformation polymorphism (SSCP) analysis, wherein abnormal strand mobility on a gel is associated with mutational events in the gene. For example, in five strains, a single base difference was found in a 200 bp sequence, a G to T transversion at position 3360. This difference would result in the substitution of Arg-461 by Leu. However, carrying out SSCP on a given clinical sample can be a laborious procedure that requires sequencing to confirm whether mutations or deletions predictive of drug resistance are in fact present in the target gene.

There is a continuing need in the art to develop a simple test permitting the rapid identification of M. tuberculosis and its drug-resistance phenotype.

SUMMARY OF THE INVENTION

The present invention provides a method to rapidly identify strains of M. tuberculosis which are resistant to isoniazid (INH). The method is based on our discovery that certain mutations in the katG gene of M. tuberculosis which confer INH resistance coincidentally result in the addition or deletion of restriction sites, which are recognized by various restriction enzymes. For example, the wild-type (WT) katG gene of M. tuberculosis contains an NciI-MspI restriction site spanning codon 463, which site is absent in the corresponding codon 463 in a number of INH resistant strains due to a single base mutation in that codon. An NciI-MspI restriction site is a site cleaved by both NciI and MspI. This site is represented by the nucleotide sequence CCGGG (see Table 1).

Alternatively, or in addition, some INH resistant strains have a single base mutation in codon 315 in the katG gene that produces a new MspI restriction site associated with the corresponding codon 315 in the INH resistant strain. Some INH resistant strains have a single base mutation at codon 337 that results in the deletion of a RsaI restriction site otherwise present at the corresponding position in the WT gene; and some have a single base mutation at codon 264 that eliminates a CfoI restriction site. These mutations may be present singly or in combination in INH resistant M. tuberculosis strains.

When used in reference to nucleotide position, codon position or restriction site position, the term "corresponding" is defined to mean the same absolute location on two different M. tuberculosis katG genes, wherein absolute location is defined by the numbering system used in FIG. 7 (SEQ ID NO:20). For example, a wild-type codon 463 represented by CGG at nucleotide positions 1456-1458 on a wild-type katG gene of M. tuberculosis and a mutant codon 463 represented by CTG at the same nucleotide positions 1456-1458 on a katG gene of an INH resistant strain of M. tuberculosis are considered to be corresponding codons.

The determination of whether one or more of these identifying mutations in the katG gene are present in a strain of M. tuberculosis can be made by employing the techniques of restriction fragment length polymorphism (RFLP) analysis. Therefore, in an embodiment directed to the identification of a mutation in codon 463 that is associated with INH resistance, the present assay comprises the steps of:

(a) amplifying a portion of the katG gene of an M. tuberculosis isolate to yield a detectable amount of DNA comprising the nucleotide position occupied by base 1457 of the M. tuberculosis katG gene consensus sequence depicted in FIG. 7 (SEQ ID NO:20); and

(b) determining whether an NciI-MspI restriction site is absent in codon 463 of said katG gene, wherein said absence is indicative of an INH resistant strain of M. tuberculosis.

The RFLP technique involves cleaving the DNA with a restriction endonuclease which cleaves at an NciI-MspI restriction site to yield at least one DNA fragment and determining whether the number and location of the fragments is indicative of the absence of an NciI-MspI restriction site in codon 463 of said katG gene, wherein said absence is indicative of an INH resistant strain of M. tuberculosis, preferably by employing the techniques of gel electrophoresis.

If the amplified DNA of step (a) contains no NciI-MspI restriction sites, then the DNA fragment yielded in step (b) will be identical to the amplified DNA of step (a). This can occur where the portion of the katG gene amplified in step (a) is from an INH resistant strain of M. tuberculosis having a mutation in codon 463 that removes the NciI-MspI restriction site spanning that codon in the wild-type katG gene, and having no other additional NciI-MspI restriction sites.

In order for the amplified DNA to yield a meaningful RFLP pattern, the portion of the katG gene amplified in step (a) will be of sufficient length to produce fragments of sufficient length to visualize using gel electrophoresis. In the above-described embodiment, for example, the portion amplified will contain a sufficient number of bases to either side (5' or 3') of codon 463 such that cleavage at a site spanning that codon will yield fragments that can be visualized using gel electrophoresis.
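The fragment logic of steps (a) and (b) can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not part of the described assay: the function name `digest` is hypothetical, only the strand as written is searched, and the two 27-base excerpts around codon 463 are taken from Table 2 rather than from a real amplicon, which would be considerably longer.

```python
def digest(seq, site="CCGG", cut_offset=1):
    """Return fragment lengths after cutting a linear DNA at every `site`.

    cut_offset is the number of bases of the recognition site that stay on
    the 5' fragment (1 for MspI, which cuts C/CGG).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# 27-base excerpts spanning codon 463 (Table 2, SEQ ID NO:14 and NO:15)
WT_463 = "AAGAGCCAGATCCGGGCATCGGGATTG"   # codon 463 = CGG (INH sensitive)
MUT_463 = "AAGAGCCAGATCCTGGCATCGGGATTG"  # codon 463 = CTG (INH resistant)

print(digest(WT_463))   # site present: the excerpt is cut into two fragments
print(digest(MUT_463))  # site absent: the excerpt stays in one piece
```

In this toy case the mutant excerpt is returned uncut, which mirrors the situation described above in which the product of step (b) is identical to the amplified DNA of step (a).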

In another embodiment of the invention directed to the additional identification of a mutation in codon 315 associated with INH resistance, the amplified DNA of step (a) further comprises at least one MspI restriction site and the nucleotide position occupied by base 1013 (FIG. 7, SEQ ID NO:20), and the determination made in step (b) further includes whether an MspI restriction site associated with codon 315 is present, wherein said presence is indicative of an INH resistant strain. For example, RFLP can also be employed to determine whether the number and location of the fragments is indicative of the codon 315 MspI restriction site. Preferably, the portion of the katG locus which is amplified is a minor portion of the entire katG gene, i.e., less than 1500 base pairs, more preferably less than 1000 base pairs, and is isolated and amplified by polymerase chain reaction, as described hereinbelow. The term "location" refers to the Rf (relative electrophoretic mobility) of a given fragment on the gel.

The pattern of fragments produced on a gel by electrophoresis of a restriction digest of an amplified portion of the katG gene of an M. tuberculosis strain of interest, such as an INH resistant strain, is preferably compared to the pattern produced in a digest of an equivalent portion of the katG gene of a wild-type (WT) control strain of M. tuberculosis, which strain is INH sensitive. The term "equivalent" is defined herein to mean that any two portions of the katG gene would comprise the same number and location of restriction sites being analyzed (e.g., sites recognized by CfoI, RsaI, MspI, and/or NciI) if the portions both were selected from a portion of the DNA of SEQ ID NO:20 (i.e., if there were no mutations altering the number of restriction sites of the type being analyzed), and that the portions do not differ in size before cleavage to the extent that the number of fragments obtained cannot be compared following side-by-side gel electrophoresis and visualization of the resultant fragments, as described hereinbelow. For example, the control katG DNA can correspond to an equivalent portion of SEQ ID NO:20 (FIG. 7, upper sequence) comprising one or more of the codons of interest (e.g., codons 315 or 463) and their associated restriction sites. As discussed below, such a portion of DNA can be derived from strain H37Rv MC. A positive control corresponding to DNA fragments derived from a known INH resistant strain may also be used.

In the embodiment of the assay of the invention directed to the determination of the presence or absence of a NciI-MspI restriction site associated with codon 463, gel electrophoresis is employed to compare the number and location of the DNA fragments to the number and location of DNA fragments derived from cleavage of DNA derived from an equivalent portion of the katG gene wherein the NciI-MspI restriction site at codon 463 is present, wherein a determination of the absence of the restriction site at codon 463 in the katG gene is indicative of an INH resistant strain of M. tuberculosis. Preferably, the control DNA sequence of the portion of the katG gene wherein the codon 463 restriction site is present corresponds to a portion of SEQ ID NO:20 (FIG. 7, upper sequence). For example, the control DNA may contain five NciI-MspI restriction sites in each DNA molecule prior to cleavage, and the DNA of step (a), which is derived from an INH resistant strain, may contain four NciI-MspI restriction sites in each DNA molecule prior to cleavage. The assay also preferably includes positive control DNA fragments derived from an INH resistant strain which does not include the codon 463 NciI-MspI restriction site in the katG gene.

The present invention also provides a method for selectively detecting M. tuberculosis in a DNA sample, wherein the DNA is amplified to generate a detectable amount of amplified DNA comprising a katG DNA fragment which consists of base 904 through base 1523 of the M. tuberculosis katG gene. The generation of this katG DNA fragment is indicative of the presence of M. tuberculosis in the sample. Preferably, the DNA sample is a human biological tissue or fluid, more preferably a human biological fluid. The DNA sample is most preferably human sputum. Advantageously, the method of the invention can be performed on clinical human sputum samples with minimal pretreatment of the clinical sample.

In a preferred embodiment of the M. tuberculosis detection method, the katG DNA fragment has a restriction site that comprises either a G or a C at the nucleotide position occupied by base 1013 in codon 315 of the M. tuberculosis katG gene as depicted in FIG. 7 (SEQ ID NO:20), and the method further comprises contacting the katG DNA fragment with a restriction endonuclease, preferably MspI, that cleaves either at the restriction site comprising a G at the nucleotide position occupied by base 1013 of codon 315, or at the restriction site comprising a C at the nucleotide position occupied by base 1013 of codon 315, but not both, yielding at least one cleaved fragment. The at least one cleaved fragment is electrophoresed to yield an electrophoretic mobility pattern comprising the at least one cleaved fragment, and the mobility pattern is analyzed to selectively detect the presence of M. tuberculosis in the sample. Preferably, restriction fragment length polymorphism (RFLP) analysis is used to analyze the electrophoretic mobility pattern generated by gel electrophoresis of the cleaved fragments to detect M. tuberculosis.

In another embodiment of the M. tuberculosis detection method, M. tuberculosis is selectively detected in a DNA sample by:

(a) amplifying the DNA to generate a detectable amount of amplified DNA comprising a katG DNA fragment comprising base 904 through base 1523 of the M. tuberculosis katG gene, wherein the katG DNA fragment further comprises a restriction site comprising either a G or a C at the nucleotide position occupied by base 1013 in codon 315 of the M. tuberculosis katG gene as depicted in FIG. 7 (SEQ ID NO:20);

(b) contacting the katG DNA fragment with a restriction endonuclease, preferably MspI, that cleaves either at said restriction site comprising a G at the nucleotide position occupied by base 1013 of codon 315, or at said restriction site comprising a C at the nucleotide position occupied by base 1013 of codon 315, but not at both of said restriction sites, to yield at least one cleaved fragment;

(c) electrophoresing, preferably using gel electrophoresis, the at least one cleaved fragment to yield an electrophoretic mobility pattern comprising the at least one cleaved fragment; and

(d) analyzing the mobility pattern, preferably using RFLP, to selectively detect the presence of M. tuberculosis in the sample.

In a further embodiment of the M. tuberculosis detection method, M. tuberculosis is selectively detected in a sample containing DNA by:

(a) amplifying the DNA to generate a detectable amount of amplified DNA comprising a katG DNA fragment comprising base 904 through base 1523 of the M. tuberculosis katG gene as depicted in FIG. 7 (SEQ ID NO:20);

(b) contacting the katG DNA fragment with a restriction endonuclease, preferably MspI, that cleaves at C/CGG to yield at least one cleaved fragment;

(c) electrophoresing, preferably using gel electrophoresis, the at least one cleaved fragment to yield a mobility pattern comprising the at least one cleaved fragment; and

(d) analyzing the mobility pattern, preferably using RFLP, to selectively detect the presence of M. tuberculosis in the sample.

In a particularly preferred embodiment, the M. tuberculosis detection method further comprises determining whether or not the katG DNA fragment has a S315T mutation, preferably using a restriction digest followed by gel electrophoresis of the digested DNA and RFLP analysis of the electrophoretic mobility patterns, wherein the presence of a S315T mutation is indicative of an INH-resistant strain of M. tuberculosis.
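The way an MspI pattern is read can be summarized as a small decision table. The Python sketch below is hypothetical (the function name and return format are illustrative, not taken from the specification) and assumes that the presence or absence of the two codon-associated MspI sites has already been inferred from the gel: the mutant codon 315 creates an MspI site, whereas the mutant codon 463 removes one.

```python
def call_katg_genotype(site_315_present, site_463_present):
    """Interpret the two codon-associated MspI sites of a katG amplicon.

    Returns (genotype, resistance_indicated). A True flag means the pattern
    is indicative of INH resistance; it is not proof of the phenotype.
    """
    s315t = site_315_present        # mutation at codon 315 creates a C/CGG site
    r463l = not site_463_present    # mutation at codon 463 destroys a C/CGG site
    if s315t and r463l:
        genotype = "S315T + R463L"
    elif s315t:
        genotype = "S315T"
    elif r463l:
        genotype = "R463L"
    else:
        genotype = "wild-type"
    return genotype, (s315t or r463l)


for flags in [(False, True), (True, True), (False, False), (True, False)]:
    print(flags, call_katg_genotype(*flags))
```

Keeping the genotype call separate from the resistance flag leaves room for a laboratory to weigh each mutation differently when reporting results.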

The present invention also provides oligonucleotides and subunits thereof useful in pairs as primers to initiate the polymerase chain reaction (PCR). Subunits of at least seven bases in length are preferred. PCR is useful to amplify katG DNA so as to prepare both the target DNA of step (a) of the present process and the DNA which is used to prepare the control digest.

The present invention also provides isolated, purified DNA represented by the consensus sequence derived for the M. tuberculosis katG gene. This DNA was found to occur in nature as the katG gene of M. tuberculosis strain H37Rv MC, as maintained at the Mayo Clinic, and is also referred to as the wild-type (WT) DNA. The present invention also includes isolated, purified DNA encoding the consensus amino acid sequence encoded by the consensus wild-type katG DNA, as well as DNA sequences that differ in sequence but which also encode this amino acid sequence (a consensus catalase peroxidase polypeptide) and can be employed to provide the isolated, purified polypeptide represented by the consensus amino acid sequence, which polypeptide is also provided by the invention.

The polypeptide of the invention can be prepared by expression in transformed host cells, such as bacteria, yeast, plant, or insect cells transformed with the DNA sequences of the present invention, operatively linked to regulatory regions functional in the transformed host cells. The polypeptide can be used as a standard M. tuberculosis catalase peroxidase, to correlate enzymatic activity (relative level, loss and restoration), with INH modification and degradation and drug resistance in M. tuberculosis.

The present invention also provides a kit comprising, separately packaged in association:

(a) a pair of oligonucleotide primers selected so as to amplify a portion of the DNA of the M. tuberculosis katG gene comprising base 1457 in codon 463 or base 1013 in codon 315, as depicted in FIG. 7 (SEQ ID NO:20); and

(b) an amount of a restriction endonuclease such as MspI, effective to cleave the amplified portion of said DNA at a restriction site comprising said base 1457 or said base 1013.

The present kits will also preferably comprise instruction means for carrying out the present assay, i.e., a printed package insert, tag or label, or an audio or video tape. The present kits will also preferably comprise a control DNA digest prepared by amplifying a portion of the consensus DNA of SEQ ID NO:20 (FIG. 7), that is equivalent to the portion defined and amplified by the pair of primers, followed by digestion of the DNA with a suitable restriction endonuclease such as MspI.

The present invention is exemplified by the use of NciI, MspI, CfoI, and RsaI digestions, with the use of MspI digestion being preferred; however, any restriction endonuclease having a restriction site spanning all or a portion of codon 463, codon 315, codon 337, or codon 264, which portion contains the site of the single base mutation associated with INH resistance as identified in Table 2, may be used, as desired. For example, the restriction endonucleases listed in Table 1 can be employed. Particularly preferred are restriction endonucleases having a restriction site that contains the position occupied by base 1457 in codon 463, or base 1013 in codon 315, as depicted in FIG. 7 (SEQ ID NO:20).

TABLE 1
__________________________________________________________________________
M. tuberculosis
katG gene
Specificity    Restriction Site    Restriction Enzyme
__________________________________________________________________________
Cuts 264-A     C/CGC               AciI^a
(sensitive)    GC/NGC^b            BsoFI, Fnu4HI, Bsp6I, BssFI,
                                   BssXI, Cac824I, CcoP215I,
                                   CcoP216I, FbrI, ItaI, Uur960I
               R/GCGCY^c           Bsp143II, HaeII, Bme142I,
                                   BsmHI, Bst1473II, Bst16I,
                                   Btu34II, HinHI, LpnI, NgoI
               G/CGC               CfoI, HhaI, BcaI, CcoP95I,
                                   Csp1470I, FnuDIII, Hin6I,
                                   Hin7I, HinGUI, HinP1I, HinS1I,
                                   HinS2I, MnnIV, SciNI
Cuts 264-T     GACGCNNNNN/NNNNN    HgaI
(resistant)    (SEQ ID NO:22)
Cuts 337-Y     GT/AC               RsaI, AfaI, Asp16HI, Asp17HI,
(sensitive)                        Asp18HI, Asp29HI, CcoP73I,
                                   Csp6I, CviQI, CviRII
Cuts 337-C     GC/NGC              BsoFI, Fnu4HI, Bsp6I, BssFI,
(resistant)                        BssXI, Cac824I, CcoP215I,
                                   CcoP216I, FbrI, ItaI, Uur960I
               C/CGC               AciI^a
Cuts 315-S     C/CGC               AciI^a
(sensitive)    GC/NGC              BsoFI, Fnu4HI, Bsp6I, BssFI,
                                   BssXI, Cac824I, CcoP215I,
                                   CcoP216I, FbrI, ItaI, Uur960I
               CMG/CKG^d           MspA1I, NspBII
Cuts 315-T     R/CCGGY^e           BsrFI, Cfr10I, Bco118I,
(resistant)                        Bse118I, Bsp21I, BssAI
               C/CGG               MspI, Bsu1192I, BsuFI, FinII,
                                   HapII, Hin2I, Hin5I, HpaII,
                                   MniII, MnoI, Pde137I,
                                   Pme35I, SecII, SfaGUI,
                                   Sth134I, Uba1128I, Uba1141I,
                                   Uba1267I, Uba1338I,
                                   Uba1355I, Uba1439I
Cuts 463-R     CC/SGG^f            NciI, BcnI, AhaI
(sensitive)    C/CGG               MspI, Bsu1192I, BsuFI, FinII,
                                   HapII, Hin2I, Hin5I, HpaII,
                                   MniII, MnoI, Pde137I,
                                   Pme35I, SecII, SfaGUI,
                                   Sth134I, Uba1128I, Uba1141I,
                                   Uba1267I, Uba1338I,
                                   Uba1355I, Uba1439I
Cuts 463-L     CAG/NNN/CTG         AlwNI
(resistant)    CC/WGG^g            BstNI, BstOI, MvaI
               /CCWGG              EcoRII
__________________________________________________________________________
^a AciI cleaves the complementary strand of the katG gene;
^b N = C or G or A or T;
^c R = A or G;
^d M = A or C, K = G or T;
^e Y = C or T;
^f S = C or G;
^g W = A or T.

TABLE 2^a
__________________________________________________________________________
264-A (sensitive)^b  (SEQ ID NO:8)    847  GTC GAA ACA GCG [GCG] CTG ATC GTC GGC   873
264-T (resistant)    (SEQ ID NO:9)         GTC GAA ACA GCG [ACG] CTG ATC GTC GGC
337-Y (sensitive)^c  (SEQ ID NO:10)  1066  CTC GAG ATC CTG [TAC] GGC TAC GAG TGG  1092
337-C (resistant)    (SEQ ID NO:11)        CTC GAG ATC CTG [TGC] GGC TAC GAG TGG
315-S (sensitive)^d  (SEQ ID NO:12)  1000  GAC GCG ATC ACC [AGC] GGC ATC GAG GTC  1026
315-T (resistant)    (SEQ ID NO:13)        GAC GCG ATC ACC [ACC] GGC ATC GAG GTC
463-R (sensitive)^e  (SEQ ID NO:14)  1444  AAG AGC CAG ATC [CGG] GCA TCG GGA TTG  1470
463-L (resistant)    (SEQ ID NO:15)        AAG AGC CAG ATC [CTG] GCA TCG GGA TTG
__________________________________________________________________________
^a The bracketed codons represent the sites where the indicated single base mutations confer INH resistance. The restriction sites involved are as follows: G/CGC for CfoI in 264-A (sensitive); GT/AC for RsaI in 337-Y (sensitive); C/CGG for MspI in 315-T (resistant) and 463-R (sensitive). For ease of reference, the partial sequences shown in this table include the 12 bases to either side of the affected codon; the numbering system is the same as used for the wild-type consensus sequence in FIGS. 1 and 7. The full sequence of bases to either side of the affected codon is shown in FIG. 7. In each of the sensitive/resistant pairs shown in this table, the upper sequence is the consensus, wild-type sequence (INH-sensitive) and the lower sequence is the mutant (INH-resistant) sequence.
^b codon 264: GCG = ala (A); ACG = thr (T)
^c codon 337: TAC = tyr (Y); TGC = cys (C)
^d codon 315: AGC = ser (S); ACC = thr (T)
^e codon 463: CGG = arg (R); CTG = leu (L)
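The relationship between Tables 1 and 2 can be checked mechanically. The following Python sketch is offered only as an illustration: the helper names `site_regex` and `has_site` are hypothetical, the degenerate-base codes follow the footnotes of Table 1, and only the strand as written in Table 2 is searched (so enzymes such as AciI, which Table 1 notes as cutting the complementary strand, are not covered).

```python
import re

# Degenerate base codes as defined in the footnotes of Table 1
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]",
         "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
         "S": "[CG]", "W": "[AT]"}


def site_regex(site):
    """Turn a recognition site such as 'CC/SGG' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in site.replace("/", "")))


def has_site(seq, site):
    """True if the recognition site occurs on the strand as written."""
    return site_regex(site).search(seq.replace(" ", "")) is not None


# Partial sequences from Table 2 (12 bases to either side of each codon)
TABLE_2 = {
    "264-A": "GTC GAA ACA GCG GCG CTG ATC GTC GGC",
    "264-T": "GTC GAA ACA GCG ACG CTG ATC GTC GGC",
    "337-Y": "CTC GAG ATC CTG TAC GGC TAC GAG TGG",
    "337-C": "CTC GAG ATC CTG TGC GGC TAC GAG TGG",
    "315-S": "GAC GCG ATC ACC AGC GGC ATC GAG GTC",
    "315-T": "GAC GCG ATC ACC ACC GGC ATC GAG GTC",
    "463-R": "AAG AGC CAG ATC CGG GCA TCG GGA TTG",
    "463-L": "AAG AGC CAG ATC CTG GCA TCG GGA TTG",
}

# Enzyme/site pairs exemplified in the text: CfoI, RsaI, MspI
for allele, site in [("264-A", "G/CGC"), ("264-T", "G/CGC"),
                     ("337-Y", "GT/AC"), ("337-C", "GT/AC"),
                     ("315-S", "C/CGG"), ("315-T", "C/CGG"),
                     ("463-R", "C/CGG"), ("463-L", "C/CGG")]:
    print(allele, site, has_site(TABLE_2[allele], site))
```

The same helper can be pointed at any other site listed in Table 1, for example `has_site(TABLE_2["463-L"], "CC/WGG")` for the BstNI site gained by the codon 463 mutant.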

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1, panels A-D, depicts the consensus, wild-type DNA sequence of the M. tuberculosis katG gene as the upper of the pair of sequences (61-2295) (SEQ ID NO:1). This DNA sequence data has been submitted to GenBank and has been assigned accession number U06262. The lower of the pair of sequences depicts nucleotide sequence 1970-4190 of the KpnI fragment bearing the katG gene as depicted in FIG. 6 of Institut Pasteur et al. (published PCT application WO 93/22454). This sequence (SEQ ID NO:2) has been deposited in the EMBL data library under accession number X68081 (GenBank X68081.gb_ba). Dots (.) above the sequence mark every tenth base. The upper sequence is in lower case in areas where variation in the sequence among isolates described hereinbelow and the consensus sequence was found. The arrows before position 70 and after position 2291 of the upper sequence indicate the coding sequence of the katG gene.

FIG. 2 depicts the katG amino acid consensus sequence derived from 15 strains of M. tuberculosis (SEQ ID NO:7).

FIG. 3 schematically depicts the NciI restriction sites for the part B amplicon of katG. The (*) depicts the site of the Arg→Leu mutation which is found in some INH resistant M. tuberculosis strains.

FIG. 4 depicts the results of a gel electrophoresis of the NciI digest of the part B amplicon of 14 strains of M. tuberculosis (1-14).

FIG. 5 schematically depicts MspI and RsaI restriction sites and resulting RFLP fragments for a portion of the M. tuberculosis katG gene. For MspI, restriction maps for wild-type (W+), single (315 Ser→Thr or 463 Arg→Leu) mutants and the double (315 Ser→Thr and 463 Arg→Leu) mutant are shown. For RsaI, restriction maps for wild-type (W+) and the 337 Y→C mutant are shown.

FIG. 6 schematically depicts CfoI restriction sites and resulting RFLP fragments for a portion of the M. tuberculosis katG gene. Restriction maps for wild-type (W+) and the 264 A→T mutant are shown.

FIG. 7, panels A-C, depicts as the upper of the pair of sequences the consensus, wild-type DNA sequence of the M. tuberculosis katG gene (SEQ ID NO:20), and as the lower of the pair of sequences the amino acid consensus sequence encoded thereby (SEQ ID NO:21). This information is updated from that presented in FIGS. 1 (SEQ ID NO:1) and 2 (SEQ ID NO:7) and the numbering system is as used therein. The amino acid and nucleotide sequences are arranged in this figure so as to facilitate convenient determination of which codons encode which amino acid in the polypeptide sequence.

FIG. 8 depicts the RFLP patterns produced by an MspI restriction digest of an amplified portion of the DNA of the katG genes of wild-type and mutant strains of M. tuberculosis, wherein the mutant DNA contains mutations at either codon 315 or codon 463, or both.

FIG. 9 depicts the RFLP patterns produced by an RsaI restriction digest of an amplified portion of the DNA of the katG genes of wild-type and mutant strains of M. tuberculosis, wherein the mutant DNA contains a mutation at codon 337.

FIG. 10 depicts the RFLP patterns produced by a CfoI restriction digest of an amplified portion of the DNA of the katG genes of wild-type and mutant strains of M. tuberculosis, wherein the mutant DNA contains a mutation at codon 264.

DETAILED DESCRIPTION OF THE INVENTION

Wild type strains of M. tuberculosis are highly susceptible to isoniazid (INH) with minimum inhibitory INH concentration (MIC or IC_(min)) ≦0.02 μg/ml, and a susceptible strain is considered to be one with an IC_(min) <1.0 μg/ml. At the Mayo Clinic, Rochester, Minn., clinical strains of M. tuberculosis (including MDR-TB strains) were identified which exhibit intermediate to high level resistance to INH (IC_(min) range 1.0 to >32 μg/ml). Many of these strains, especially those highly resistant to INH (≧4.0 μg/ml), exhibited diminished catalase activity as assessed by a semiquantitative technique. The mean semiquantitative catalase was 16.5 mm for 6/15 strains with INH IC_(min) <1.0 μg/ml and 13.3 mm for 9/15 strains with IC_(min) ≧1.0 μg/ml.

To develop the present assay, it was first necessary to determine whether some M. tuberculosis strains have decreased INH sensitivity as a result of katG gene mutations. Therefore, the nucleic acid sequences of the katG genes for both INH sensitive (IC_(min) <1.0 μg/ml) and INH resistant (IC_(min) ≧1.0 μg/ml) M. tuberculosis strains were determined. From the DNA sequencing data generated, a katG consensus sequence was derived, and katG sequences from all 15 M. tuberculosis strains (INH sensitive and INH resistant) were compared to the consensus sequence to determine katG deviations.

Five of nine INH resistant strains (INH IC_(min) ≧1.0 μg/ml) had one or more missense mutations; one had a nonsense mutation; one had an 8 base pair deletion; and two had no mutations in the coding sequences. All of the five strains with missense mutations had a common G to T transversion at base 1457 in codon 463 (bases 1456-1458) causing replacement of arginine with leucine and loss of an NciI-MspI restriction site. Two of those having mutations at codon 463 also showed a G to C transversion at base 1013 in codon 315 (bases 1012-1014) causing replacement of serine with threonine. A third contained a G to A transition at base 859 in codon 264 (bases 859-861) resulting in the replacement of alanine by threonine, and a fourth contained an A to G transition at base 1079 in codon 337 (bases 1078-1080), causing tyrosine to be replaced by cysteine. The numbering system is shown in FIG. 1 (SEQ ID NO:1). The affected codons and portions of the DNA sequences on either side are shown for both INH sensitive and INH resistant strains in Table 2.
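The four point mutations described above can be checked with a short sketch. It assumes the wild-type codons quoted elsewhere in this description (AGC, CGG, TAC, GCG), uses only the handful of standard genetic code entries needed here, and infers the codon-to-base offset from the four codon ranges given in the text; the function names are illustrative only.

```python
# Sketch only: codon table limited to the codons discussed in the text.
PURINES = {"A", "G"}

CODON_TO_AA = {
    "CGG": "Arg", "CTG": "Leu",   # codon 463
    "AGC": "Ser", "ACC": "Thr",   # codon 315
    "GCG": "Ala", "ACG": "Thr",   # codon 264
    "TAC": "Tyr", "TGC": "Cys",   # codon 337
}

def codon_bases(codon_number):
    """First and last base of a codon in the FIG. 1 numbering.

    The offset of 67 is inferred from the ranges quoted in the text
    (e.g. codon 463 = bases 1456-1458); it is an assumption.
    """
    first = 3 * codon_number + 67
    return first, first + 2

def substitution_class(ref, alt):
    """Transition: purine<->purine or pyrimidine<->pyrimidine; else transversion."""
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"

def describe(codon_number, base, wild_codon, alt):
    """Return (mutant codon, wild-type residue, mutant residue, substitution class)."""
    first, last = codon_bases(codon_number)
    if not first <= base <= last:
        raise ValueError("base lies outside the codon")
    offset = base - first
    ref = wild_codon[offset]
    mutant = wild_codon[:offset] + alt + wild_codon[offset + 1:]
    return (mutant, CODON_TO_AA[wild_codon], CODON_TO_AA[mutant],
            substitution_class(ref, alt))

if __name__ == "__main__":
    for args in [(463, 1457, "CGG", "T"), (315, 1013, "AGC", "C"),
                 (264, 859, "GCG", "A"), (337, 1079, "TAC", "G")]:
        print(args[0], describe(*args))
```

Under these assumptions the base positions, codon ranges and amino acid replacements given in the text agree with one another.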

Six INH sensitive strains (INH IC_(min) <1.0 μg/ml) were also sequenced and found to have from none to 5 amino acid differences with the consensus sequence of all 15 strains, but none of the mutations affected codons 463, 315, 264, or 337 or their overlapping restriction sites. Restriction analysis of a total of 32 sensitive and 43 resistant strains revealed a common restriction fragment length polymorphism (RFLP) in nearly half (19) of the 43 INH resistant strains, but in only one of the INH sensitive strains. Specifically, 44% of the INH resistant strains had lost the NciI-MspI restriction site at the locus of codon 463, while only 1 of 32 sensitive strains had this restriction polymorphism.

Subsequently, the frequency of codon 463 (R→L) and codon 315 (S→T) mutations in 97 M. tuberculosis clinical isolates was determined. These isolates were obtained from patients treated at Mayo Clinic and from samples referred from other health care institutions. Restriction fragment length polymorphism (RFLP) analysis using the MspI restriction enzyme, which cleaves at a site spanning the consensus codon 463 site and at a site comprising a portion of the mutant codon 315 site on the katG gene of M. tuberculosis, was performed on amplified DNA from the 97 clinical isolates. Comparison of the resulting RFLP patterns and IC_(min) for isoniazid revealed that of the 90 INH-resistant strains, approximately 10% had both mutations, 20% had the 315 S→T mutation only, and 26% had the 463 R→L mutation only. Thus, 51 of the 90 resistant strains were identified by RFLP as having mutations at codons 463, 315 or both, resulting in detection of over 50% of the resistant strains by this molecular method in a single experiment. Only one of the seven INH sensitive strains was found to have the 463 R→L mutation, and none of the INH sensitive strains had the 315 S→T mutation. Greater INH resistance (>4.0 μg INH/ml) is associated with the 315 S→T mutation, but not if the 463 R→L mutation is also present.
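The counts reported for the 97 isolates can be summarized as detection rates. The following is a minimal sketch using only the numbers stated above; treating the culture-based IC_(min) as the reference result, and the terms "sensitivity" and "specificity", are assumptions of this illustration rather than terms used in the description.

```python
# Counts taken from the text; variable and function names are illustrative.
RESISTANT_TOTAL = 90
RESISTANT_WITH_MUTATION = 51   # codon 463, codon 315, or both, by MspI RFLP
SENSITIVE_TOTAL = 7
SENSITIVE_WITH_MUTATION = 1    # 463 R->L only

def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

def detection_summary():
    """Fraction of resistant strains flagged and of sensitive strains not flagged."""
    sensitivity = percent(RESISTANT_WITH_MUTATION, RESISTANT_TOTAL)
    specificity = percent(SENSITIVE_TOTAL - SENSITIVE_WITH_MUTATION, SENSITIVE_TOTAL)
    return sensitivity, specificity

if __name__ == "__main__":
    sens, spec = detection_summary()
    print(f"resistant strains flagged: {sens}%")
    print(f"sensitive strains not flagged: {spec}%")
```

With only seven sensitive strains in this set, the second figure is a rough estimate at best.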

These results indicate that two mutations, arginine→leucine in codon 463 and serine→threonine in codon 315 of the M. tuberculosis catalase-peroxidase (katG) gene, occur in a significant fraction of INH resistant M. tuberculosis strains (INH IC_(min) ≧1.0 μg/ml). Furthermore, these single base mutations can be determined using a rapid, relatively simple method, i.e., PCR amplification, digestion and monitoring for a loss of an NciI and/or an MspI restriction site at codon 463, and the addition of an MspI restriction site at codon 315, by RFLP, as described in detail hereinbelow. Other restriction endonucleases can be used to determine whether or not these single base mutations exist in a katG gene of interest, as long as the restriction site cleaved by the restriction endonuclease contains the affected base, such that the endonuclease cleaves the wild-type sequence but not the corresponding mutant sequence, or vice versa. Although in a preferred embodiment of the invention the number and location of the fragments is determined by gel electrophoresis, the presence or absence in the digest of a fragment comprising the indicated restriction sites can be determined by other methods known to the art, including immunoassays (dot blots and reverse dot blots), DNA probes, microtiter well capture and the like.

It was further found that the RFLP patterns produced by an MspI digest of the M. tuberculosis DNA fragment katG 904-1523 (FIG. 8, lanes B-S) are specific for M. tuberculosis and thus allow selective detection of M. tuberculosis. MspI digestion of any M. tuberculosis katG gene fragment containing the MspI restriction sites depicted in FIG. 5 is expected to yield a similar M. tuberculosis-specific RFLP pattern. In contrast, MspI digestions of DNA samples containing other microorganisms, such as mycobacteria other than M. tuberculosis (MOTT), yield either different, distinguishable RFLP patterns, or no detectable restriction fragments at all.

When oligonucleotide primers, preferably katG904 and katG1523, are used in a PCR, the 620 base pair M. tuberculosis DNA amplicon katG 904-1523 is generated from a sample that contains M. tuberculosis, whereas the PCR typically yields no amplicon at all for non-M. tuberculosis samples (i.e., the primers do not function to create an amplified product). Thus, the generation of the katG 904-1523 amplicon itself is indicative of the presence of M. tuberculosis in the sample. Subsequent enzymatic digestion of the amplicon, preferably using restriction endonuclease MspI, can be used to confirm the determination of M. tuberculosis.

The present invention will be further described by reference to the following detailed examples. The 58 clinical strains of Mycobacterium tuberculosis used in Examples 1 and 2 were obtained from the Mycobacteriology Laboratory at the Mayo Clinic, Rochester, Minn., and the 17 M. tuberculosis DNA preparations were obtained from the GWL Hansen's Disease Center, Louisiana State University, Baton Rouge, La. The strain designated H37Rv MC has been maintained at the Mayo Clinic for over 50 years, and therefore was isolated before INH became available as a treatment modality for tuberculosis (circa 1952). H37Rv was deposited in the American Type Culture Collection, Rockville, Md. in 1937 by A. Karlson of the Mayo Clinic under the accession number ATCC 25618, and has been freely available to the scientific community since. An apparent variant of this strain is disclosed in PCT WO 93/22454 (SEQ ID NO:2, herein). The ATCC strains 27294 and 25618 were recovered from the same patient in 1905 and 1934, respectively. All clinical M. tuberculosis strains were confirmed as M. tuberculosis using routine identification techniques described by J. A. Washington, "Mycobacteria and Nocardia," in: Laboratory Procedures in Clinical Microbiology, 2d ed., Springer-Verlag, N.Y. (1985) at pages 379-417.

For the 15 M. tuberculosis strains for which complete katG DNA sequencing was performed, susceptibility testing was done at the Mayo Clinic using Middlebrook 7H11 agar (DiMed, Inc., St. Paul, Minn. 55113) and the 1% proportion method described in Manual of Clinical Microbiology, 5th ed., A. Balows et al., eds., Amer. Soc. Microbiol. (1991) at pages 1138-52. The same method was used at the Mayo Clinic to determine susceptibility for an additional 43 M. tuberculosis strains for which restriction fragment length polymorphisms (RFLP) were determined. Isoniazid concentrations tested using this method included: 0.12, 0.25, 1.0, 2, 4, 8, 16, 32 μg/ml for the 15 strains sequenced and 1.0 and 4.0 μg/ml for the remaining 43 strains. Isoniazid resistance was defined as a minimum inhibitory concentration (IC_(min)) ≧1.0 μg/ml. Susceptibility testing was performed elsewhere for an additional 17 M. tuberculosis strains for which DNA lysates were provided by Diana L. Williams, Baton Rouge, La. These strains were of diverse geographical origin. Ten of these 17 strains originated from Japan. The remaining 7 M. tuberculosis INH resistant strains included multiple drug resistant strains from recent multiple drug resistant tuberculosis (MDR-TB) nosocomial epidemics in New York, N.Y. and Newark, N.J. All were INH resistant (IC_(min) ≧1.0 μg/ml), and had resistance to at least one other drug. For all strains provided by Williams, the 1% direct proportion method was used, but the concentrations of INH tested and the media used varied from site to site.

To conduct a semiquantitative test of catalase activity, M. tuberculosis strains were propagated on Lowenstein-Jensen media deeps contained in 20×150 mm screw-capped tubes. One ml of a 30% hydrogen peroxide (EM Science, Gibbstown, N.J. 08027) and 10% Tween 80 (Aldrich Chemical Co., Milwaukee, Wis. 53233) solution mixed in a 1:1 ratio was applied to the surface of growth. After 5 minutes, the height (mm) of the column of bubbles (O₂) generated was recorded.

EXAMPLE 1 DNA Isolation and Polymerase Chain Reaction

A. DNA Isolation

For M. tuberculosis strains obtained from Mayo Clinic samples, DNA was extracted from cells using phenol (Boehringer Mannheim, Indianapolis, Ind. 46250-0414) and TE (1.0 M Tris HCl pH 8.0, 0.1 M EDTA, Sigma, St. Louis, Mo. 63778) in a ratio of 600 μl:400 μl and 0.1 mm zirconium beads (Biospec Products, Bartlesville, Okla. 74005). The mixture was processed in a mini-bead beater for 30 seconds and allowed to stand for an additional 15 minutes. Following a brief centrifugation to sediment the zirconium beads, DNA in the supernatant was extracted using the IsoQuick kit (MicroProbe Corp., Garden Grove, Calif. 92641).

B. PCR Using Primer Pairs A1-A4 and B1-B2

The DNA sequence for katG (EMBL no. X6808124) employed to design primers is depicted in FIG. 1(A-D), lower strand. The PCR method of R. K. Saiki et al., Science, 239, 487 (1988) was used to amplify the katG gene (ca. 2220 base pairs) in two segments which were designated A and B. Genomic DNA preparations (2 μl) were used with primers A1 (5' TCGGACCATAACGGCTTCCTGTTGGACGAG 3') (SEQ ID NO:3) and A4 (5' AATCTGCTTCGCCGACGAGGTCGTGCTGAC 3') (SEQ ID NO:4) or B1 (5' CACCCCGACGAAATGGGACAACAGTTTCCT 3') (SEQ ID NO:5) and B2 (5' GGGTCTGACAAATCGCGCCGGGCAAACACC 3') (SEQ ID NO:6).

The PCR mixture (50 μl) contained 10 mM TRIS, pH 8.3, 50 mM KCl, 1.5 mM MgCl₂, 0.2 mM each of dATP, dTTP, dGTP, dCTP, 1 μM of each primer pair, 10% glycerol, 1.25 units/50 μl AmpliTaq DNA polymerase (Perkin Elmer Cetus). The mixture was overlaid with mineral oil and subjected to 4 min at 95° C. followed by 50 cycles of 1 min at 94° C. and 2 min at 74° C. A 1495 base pair product from the first half of katG was generated with the A1-A4 primer pair and a 1435 base pair product was generated with the B1-B2 primer pair.

EXAMPLE 2 DNA Sequencing and Homology Analysis

The polymerase chain reaction (PCR) products were prepared for sequencing using the Magic™ PCR Preps DNA Purification System (Promega Corp., Madison, Wis. 53711). The DNA sequences were determined in both directions using the Taq dye-deoxy terminator cycle sequencing kit and 373A DNA sequencer (Applied Biosystems, Foster City, Calif. 94404) using a series of internal sequencing primers which provided appropriate coverage of katG.

The sequence data were analyzed using version 7 of the Genetics Computer Group sequence analysis software, as disclosed by J. Devereux et al., Nucl. Acids Res., 12, 387 (1984). From the 15 M. tuberculosis DNA sequences, a consensus sequence was derived to which all M. tuberculosis strains were compared. This consensus sequence is depicted in FIG. 1(A-D) (SEQ ID NO:1) as the upper strand, and is compared to the sequence for katG (EMBL no. X6808124), depicted as the lower strand. The two sequences have 98.6% identity, as determined by the GCG program BESTFIT. The DNA sequence data have been submitted to GenBank and can be referenced by the accession numbers U06262 (H37Rv MC), U06258 (ATCC 25618), U06259 (ATCC 27294), U06260 (G6108), U06261 (H35827), U06270 (L6627-92), U06271 (L68372), U06264 (L11150), U06268 (L24204), U06269 (L33308), U06265 (L16980), U06266 (L1781), U06272 (TMC306), U06263 (L10373), and U06267 (L23261). An updated, more complete and accurate M. tuberculosis katG gene sequence is presented in FIG. 7(A-C) (SEQ ID NO:20).

The DNA data was then translated, aligned for comparison and a consensus amino acid sequence was generated (FIG. 2) (SEQ ID NO:7). The consensus amino acid sequence (SEQ ID NO:21) generated from the DNA of SEQ ID NO:20 is also presented in FIG. 7.

In general, the overall sequence agreement between INH sensitive and resistant strains was very high; the only deviations are those shown in Table 3.

                                             TABLE 3
Analysis of Catalase-Peroxidase (katG) Gene in M. tuberculosis Strains
________________________________________________________________________________________________________
            INH MICᵃ Catalase  Amino Acid Codonᵇ
Strain      (μg/ml)  (mm)      2    10      17   90   224  243  264  315  337  424  463  505  550  609
________________________________________________________________________________________________________
H37Rv MC    <0.12    20
ATCC 25618  <0.12    12
ATCC 27294  0.12     28        P-S          S-N       Q-E  A-S                                A-D
G6108       <0.12    12                                                                            M-I
H35827      0.25     14
L6627-92    0.5      13
L68372      1        8                                                    Y-C       R-L
L11150      8        28
L24204      8        36                                              S-T            R-L
L33308      8        15
L16980      16       15                                              S-T            R-L
L1781       32       5                                          A-T                 R-L
TMC 306     >32      5                           W*ᶜ
L10373      >32      5              8 bpdᵈ                                     A-V            A-D
L23261      >32      5                                                              R-L  W-R       M-I
Consensus                      P            S    W    Q    A    A    S    Y    A    R    W    A    M
________________________________________________________________________________________________________
ᵃ MIC denotes minimum inhibitory concentration; INH denotes isoniazid.
ᵇ A denotes alanine, C cysteine, D aspartic acid, E glutamic acid, F phenylalanine, G glycine, I isoleucine, K lysine, L leucine, M methionine, N asparagine, P proline, Q glutamine, R arginine, S serine, T threonine, V valine, W tryptophan, Y tyrosine; bpd denotes base pair deletion.
ᶜ TGG→TGA (W→stop codon).
ᵈ 8 base pair deletion corresponding to wild type coordinates 98-105 creates a new TAG stop codon beginning 11 bp from coordinate 97.

The data in Table 3 show that six strains, H37Rv MC, ATCC 25618, H35827, L6627-92, L11150, and L33308, are completely homologous to the consensus at the indicated sites. Four are INH sensitive (INH IC_(min) <1.0 μg/ml) and two are INH resistant (IC_(min) ≧1.0 μg/ml). All other strains listed in Table 3 had 1 to 5 differences with the consensus and there was no strong correlation between the number of differences and INH sensitivity.

In the group of INH resistant strains, the most frequent change observed was the conversion of arginine at codon 463 to leucine. This was detected in five of nine isolates examined. There was not a consistent correlation between the loss of catalase activity and INH resistance, since strains L11150 and L24204 had high levels of enzymatic activity, yet were INH resistant. Moreover, several other INH resistant strains showed catalase activity near the mean activity (16.5 mm) of the sensitive strains. Two other isolates had lost the ability to make normal katG gene product due either to an eight bp deletion (L10373, semiquantitative catalase 3 mm) or a nonsense mutation (TMC 306, semiquantitative catalase 5 mm). It was not possible to determine if, or how, any of the deviations from the consensus reported in Table 3 affect catalase activity or cause INH resistance. However, the change at codon 463 is frequent enough that it is indicative of resistance.

The DNA sequence analysis indicated that codon 463 occurs in the context of an NciI-MspI restriction site (both enzymes recognize the same site). Thus, when the wild type sequence depicted in FIG. 1 at bases 1455-1459, CCGGG, is changed to CCTGG, it is no longer recognized (or cleaved) by either of these enzymes. The 1435 bp amplicon produced from the half of the katG gene containing codon 463 normally has five NciI-MspI restriction sites, whereas the strains altered at this codon have only four sites, as shown in FIG. 3. The loss of the site in question causes a unique restriction fragment length polymorphism (RFLP), which can be readily adapted to assay for resistant strains, as described in Example 3, below.
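The loss of the site can be illustrated with a short sketch that searches the wild type and mutant pentamers for the recognition sequences of the two enzymes. It assumes the commonly cited recognition sequences CCGG for MspI and CCSGG (S = C or G) for NciI; the helper name is hypothetical.

```python
import re

# Assumed recognition sequences: MspI = CCGG, NciI = CC(C/G)GG.
RECOGNITION = {
    "MspI": "CCGG",
    "NciI": "CC[CG]GG",
}

def site_positions(enzyme, sequence):
    """0-based start positions of every (possibly overlapping) recognition site."""
    pattern = "(?=" + RECOGNITION[enzyme] + ")"   # lookahead allows overlaps
    return [match.start() for match in re.finditer(pattern, sequence.upper())]

if __name__ == "__main__":
    wild_type = "CCGGG"   # codon 463 context in the wild type sequence
    mutant = "CCTGG"      # after the G to T change within codon 463
    for enzyme in RECOGNITION:
        print(enzyme, site_positions(enzyme, wild_type), site_positions(enzyme, mutant))
```

In this sketch both enzymes find one site in the wild type pentamer and none in the mutant pentamer, which is the behavior the RFLP assay relies on.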

EXAMPLE 3 RFLP Analysis: MspI-NciI site in Codon 463

For restriction fragment length polymorphism (RFLP) analysis, a 1435 base pair amplimer (produced using the B1-B2 primers) representing the 3' half of the katG gene was generated using PCR and then digested with NciI or MspI (Sigma Chemical Co., St. Louis, Mo. 63178). The gene fragments were analyzed with agarose gel electrophoresis using 2% Metaphor agarose (FMC BioProducts, Richland, Me. 04811). The gel was stained with ethidium bromide and photographed. The investigator who performed all restriction digests and electrophoresis was blinded as to the INH IC_(min) results.

The results of this experiment are depicted in FIG. 4, wherein the lanes denote: (1) strain H37Rv MC, IC_(min) <0.12 μg/ml; (2) L6627-92, 0.5 μg/ml; (3) L68372, 1.0 μg/ml; (4) L16980, 16 μg/ml; (5) L39791, 16 μg/ml; (6) L1781, 32 μg/ml; (7) L9118, 4 μg/ml; (8) L11150, 8 μg/ml; (9) L24204, 8 μg/ml; (10) L68858, <0.12 μg/ml; (11) 1115A, <0.12 μg/ml; (12) L23261, >32 μg/ml; (13) 1341, >32 μg/ml; (14) M10838, >32 μg/ml; (15) molecular weight standard: PCR markers (United States Biochemical Corp., Cleveland, Ohio 44122). The digests obtained from resistant strains can be readily visually detected and differentiated from digests from susceptible strains.

Subsequently, a total of 75 M. tuberculosis strains (including the 15 strains sequenced) were analyzed for their loss of the appropriate restriction site. Of these strains, 32 were INH sensitive and 43 were INH resistant. The data showed that 19 (44%) of the 43 resistant strains had lost the expected restriction site in codon 463. One of the 32 (3.1%) sensitive strains had lost this restriction site as well. None of the six sensitive strains listed in Table 3 lost this site.

EXAMPLE 4 Determination of the Presence or Absence of Mutations at Codons 264, 315, 337 or 463 in the M. tuberculosis katG Gene

A. Materials

Primer pairs used for polymerase chain reaction were katG904/katG1523 (nucleotide sequences 5' AGC TCG TAT GGC ACC GGA AC 3' (SEQ ID NO:16) and 5' TTG ACC TCC CAC CCG ACT TG 3' (SEQ ID NO:17)) and katG633/katG983 (nucleotide sequences 5' CGG TAA GCG GGA TCT GGA GA 3' (SEQ ID NO:18) and 5' CAT TTC GTC GGG GTG TTC GT 3' (SEQ ID NO:19)). Subunits thereof that hybridize to the amplified DNA under the conditions described hereinbelow may also be used.
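As a quick consistency check on the primer pairs, the sketch below computes the expected amplicon lengths from the base coordinates embedded in the primer names, together with the reverse complement and GC fraction of each primer. It assumes the amplicon spans the named coordinates inclusively; the helper names are illustrative.

```python
# Primer sequences as given in the text (spaces removed).
PRIMERS = {
    "katG904": "AGCTCGTATGGCACCGGAAC",    # SEQ ID NO:16
    "katG1523": "TTGACCTCCCACCCGACTTG",   # SEQ ID NO:17
    "katG633": "CGGTAAGCGGGATCTGGAGA",    # SEQ ID NO:18
    "katG983": "CATTTCGTCGGGGTGTTCGT",    # SEQ ID NO:19
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Sequence of the opposite strand, read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]

def gc_fraction(sequence):
    """Fraction of bases that are G or C."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)

def amplicon_length(first_base, last_base):
    """Length of an amplicon spanning first_base..last_base inclusive (assumed)."""
    return last_base - first_base + 1

if __name__ == "__main__":
    print("katG 904-1523:", amplicon_length(904, 1523), "bp")
    print("katG 633-983:", amplicon_length(633, 983), "bp")
    for name, sequence in PRIMERS.items():
        print(name, len(sequence), "nt, GC", gc_fraction(sequence),
              "revcomp", reverse_complement(sequence))
```

The first length agrees with the 620 base pair amplicon referred to elsewhere in this description.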

Polyacrylamide was obtained from National Diagnostics, Tris Borate EDTA solution (6X, cat. no. T6400), magnesium chloride, and dithiothreitol (DTT) from Sigma Chemical Company (St. Louis, Mo.), TEMED (cat. no. 161-0800) and ethidium bromide (EtBr) from Biorad, ammonium persulfate from Intermountain Sci., and nucleotides (dATP, dGTP, dCTP and dTTP, 100 mM solutions) from Boehringer Mannheim Biochemicals. dUTP was obtained from Pharmacia. AmpErase™ uracil-N-glycosylase (UNG) and AmpliTaq™ were obtained from Perkin Elmer.

Restriction endonucleases were obtained as follows: MspI from Sigma (cat. no. R-4506) 10 u/μl with blue palette buffer; RsaI from New England Biolabs (cat. no. 167S) 10 u/μl with NEB buffer 1; and CfoI from Promega (cat. no. R624) 10 u/μl with buffer B.

100 mM nucleotide concentrates obtained from Boehringer Mannheim Biochemicals were used to make the dNTP stock solution, which was 1.25 mM in each nucleotide. Specifically, 10 μl each of dATP, dGTP, dCTP, and dTTP concentrates were added to 760 μl water. dNTP(U) stock solution, also 1.25 mM in each nucleotide, was made from the same 100 mM dATP, dGTP and dCTP concentrates, and 100 mM dUTP concentrate from Pharmacia. Ten μl of each of the four concentrates was added to 760 μl water to make the stock solution.

10X PCR buffer consisted of 100 mM Tris, pH 8.3, 500 mM KCl, and 15 mM MgCl₂. PCR mix "A" consisted of 1X PCR buffer, 200 μM each dATP, dGTP, dCTP, and dUTP, 1 μM each katG904 (SEQ ID NO:16) and katG1523 (SEQ ID NO:17) primers, 10% glycerol, 10 units/ml AmpErase™ UNG, and 0.025 units/μl AmpliTaq. PCR mix "B" consisted of 1X PCR buffer, 200 μM each dATP, dGTP, dCTP, and dTTP, 1 μM each katG904 (SEQ ID NO:16) and katG1523 (SEQ ID NO:17) primers, 10% glycerol, and 0.025 units/μl AmpliTaq. PCR mix "C" consisted of 1X PCR buffer, 200 μM each dATP, dGTP, dCTP, and dUTP, 1 μM each katG633 (SEQ ID NO:18) and katG983 (SEQ ID NO:19) primers, 10% glycerol, 10 units/ml AmpErase™ UNG, and 0.025 units/μl AmpliTaq.

Gel loading solution (Blue Juice) was obtained from Sigma (cat. no. G2526). Gels were photographed on a UV transilluminator (UVP) with Polaroid 667 black and white film (3 1/4 × 4 1/4 inch) through an orange filter.
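The stock and mix concentrations above follow from simple dilution arithmetic, sketched below. The 50 μl reaction volume is taken from the 2 μl extract plus 48 μl mix used in the procedures that follow; the function names are illustrative, and the per-reaction volumes are derived values rather than volumes stated in the text.

```python
def stock_concentration_mM(concentrate_mM, volume_each_ul, n_concentrates, water_ul):
    """Concentration of each nucleotide after pooling the concentrates with water."""
    total_ul = volume_each_ul * n_concentrates + water_ul
    return concentrate_mM * volume_each_ul / total_ul

def volume_needed_ul(stock, final, reaction_ul):
    """Volume of stock giving `final` in `reaction_ul` (stock and final in the same units)."""
    return final * reaction_ul / stock

if __name__ == "__main__":
    reaction_ul = 50.0   # 2 ul DNA extract + 48 ul PCR mix

    # 10 ul of each of four 100 mM concentrates plus 760 ul water
    dntp_stock_mM = stock_concentration_mM(100.0, 10.0, 4, 760.0)
    print("dNTP stock:", dntp_stock_mM, "mM each")

    # 200 uM (0.2 mM) of each nucleotide in the final reaction
    print("dNTP stock per reaction:", volume_needed_ul(dntp_stock_mM, 0.2, reaction_ul), "ul")

    # 10X buffer diluted to 1X
    print("10X buffer per reaction:", volume_needed_ul(10.0, 1.0, reaction_ul), "ul")

    # 0.025 units/ul AmpliTaq in the mix
    print("AmpliTaq per reaction:", 0.025 * reaction_ul, "units")
```

The enzyme amount obtained this way matches the 1.25 units per 50 μl used in Example 1.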

DNA extracts (target DNA) were prepared as described in Example 1(A).

B. MspI RFLP analysis

PCR was performed by adding 2 μl of DNA extract to 48 μl PCR mix "A." Each reaction was covered with 2 drops of mineral oil. Temperature was cycled (Perkin Elmer DNA Thermo Cycler model 480) for 1 cycle of (5'-37°; 5'-95°) and 40 cycles of (1'-94°; 0.5'-60°; 0.75'-72°) and a 72° soak. MspI (10 u/μl) was diluted 1:10 in 100 mM MgCl₂. The amplified DNA (base pairs 904 through 1523) of a wild-type katG gene contains 7 MspI restriction sites (FIG. 5); of the 8 fragments produced in an MspI restriction digest, 4 are of sufficient length to be visualized using gel electrophoresis (see FIG. 5 for a restriction map). Diluted MspI (1 μl) was mixed with 9 μl of the PCR reaction mixture containing the amplified DNA. The digest was incubated at 37° C. for 2 hours, then heated to 65° C. for 10 minutes. Subsequently, 10 μl of the digest plus 4 μl blue juice was electrophoresed on 6% polyacrylamide for 0.4 hour at 200 V. The gel was stained in EtBr (0.5 mg/ml 1XTBE) for 5 minutes and photographed.

Results are shown in FIG. 8. Lanes C, D, F, G, H, K, L, N, and Q show the wild-type genotype at codons 315 (AGC) and 463 (CGG), evidenced by 4 restriction fragments of sufficient length to be visualized using gel electrophoresis (228, 153, 137, and 65 base pairs, respectively; see FIG. 5). Lanes M and O show an RFLP indicating a mutation at codon 315 that adds a new MspI restriction site, causing the 153 base pair fragment to be shortened to 132 base pairs and become difficult to resolve from the 137 base pair fragment. The resulting 3 fragment pattern (65, 132/137 and 228 base pairs) is indicative of an INH resistant strain. Lanes E, I and P show an RFLP indicating a mutation at codon 463 that eliminates an MspI restriction site, evidenced by the 3 visible fragments produced by cleavage versus the 4 produced by the wild-type genotype. The resulting 3 fragment pattern (153, 202, and 228 base pairs) is indicative of an INH resistant strain. Lanes B and J show an RFLP indicating mutations at both codon 315 and codon 463. The resulting gain and loss of MspI restriction sites produces a distinctive 3 fragment RFLP pattern (132, 202 and 228 base pairs) indicative of an INH resistant strain (see FIG. 5 for a restriction map).
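The four patterns described above can be derived from the wild-type fragment sizes by applying the site gain and the site loss. The sketch below does this using only the fragment sizes stated in the text; it assumes that the small piece released by the new codon 315 site is too short to be seen on the gel, and the function name is illustrative.

```python
# Visible wild-type MspI fragments of the katG 904-1523 amplicon, in base pairs.
WILD_TYPE_FRAGMENTS = (228, 153, 137, 65)

def expected_pattern(s315t=False, r463l=False):
    """Visible fragment sizes (ascending) for a given genotype."""
    fragments = list(WILD_TYPE_FRAGMENTS)
    if s315t:
        # New MspI site: the 153 bp fragment is shortened to 132 bp.
        # The remaining short piece is assumed to be too small to visualize.
        fragments[fragments.index(153)] = 132
    if r463l:
        # Lost MspI site: the 137 bp and 65 bp fragments stay joined (202 bp).
        fragments.remove(137)
        fragments.remove(65)
        fragments.append(137 + 65)
    return sorted(fragments)

if __name__ == "__main__":
    for s315t in (False, True):
        for r463l in (False, True):
            print(f"S315T={s315t} R463L={r463l}:", expected_pattern(s315t, r463l))
```

For the codon 315 mutation alone the sketch lists four sizes, but the 132 and 137 base pair fragments co-migrate, so the gel shows the 3 fragment pattern described above.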

C. RsaI RFLP analysis

PCR was performed by adding 2 μl of DNA extract to 48 μl PCR mix "B." Each reaction was covered with 2 drops of mineral oil. Temperature was cycled (Perkin Elmer DNA Thermo Cycler model 480) for 1 cycle of 2 minutes at 94° and 40 cycles of (1'-94°; 0.5'-60°; 0.75'-72°) and a 4° soak. RsaI (10 u/μl) was diluted 1:20 in 100 mM MgCl₂ /100 mM dithiothreitol. The amplified DNA (bases 904 through 1523) of a wild-type katG gene contains 2 RsaI restriction sites. Diluted RsaI (2 μl) was placed on top of the PCR reaction mixture (on the oil) and centrifuged at about 12,000×g for 10 seconds to drop the RsaI enzyme into the mixture containing the amplified DNA. The resulting mixture was incubated overnight (15-20 hours) at 37°, after which 10 μl of the digest plus 1 μl blue juice was electrophoresed on 6% polyacrylamide for 0.4 hour at 200 V. The gel was stained in EtBr (0.5 mg/ml 1XTBE) for 5 minutes and photographed.

Results are shown in FIG. 9. Lanes A, B, D, E, and F show the wild-type genotype at codon 337 (TAC), evidenced by three restriction fragments produced by cleavage at two sites. Lane C shows an RFLP indicating a mutation at codon 337 that eliminates one of the RsaI restriction sites. The resulting two fragment pattern has been observed in an INH resistant strain.

D. CfoI RFLP analysis.

PCR was performed by adding 2 μl of DNA extract to 48 μl PCR mix "C." Each reaction was covered with 2 drops of mineral oil. Temperature was cycled (Perkin Elmer DNA Thermo Cycler model 480) for 1 cycle of (5'-37°; 5'-95°) and 40 cycles of (1'-94°; 0.5'-60°; 0.75'-72°) and a 72° soak. The amplified DNA (bases 633 through 983) of a wild-type katG gene contains 3 CfoI restriction sites. CfoI (10 u/μl) was diluted 1:5 in 100 mM MgCl₂. Diluted CfoI (1 μl) was mixed with 9 μl of the PCR reaction mixture containing the amplified DNA. The digest was incubated at 37° for 2 hours, then heated to 65° C. for 10 minutes. Subsequently, 10 μl of the digest plus 4 μl blue juice was electrophoresed on 6% polyacrylamide for 0.4 hour at 200 V. The gel was stained in EtBr (0.5 mg/ml 1XTBE) for 5 minutes and photographed.

Results are shown in FIG. 10. Lanes A-C show the wild-type genotype at codon 264 (GCG), evidenced by 4 restriction fragments produced by cleavage at three sites. Lane E shows an RFLP indicating a mutation at codon 264 that eliminates one of the CfoI restriction sites. The resulting three fragment pattern has been observed in an INH resistant strain.

EXAMPLE 5 Rapid Simultaneous Detection of M. tuberculosis (MTB) and Determination of Isoniazid (INH) Susceptibility Directly from Sputum

Five microliter aliquots of 785 ethanol-fixed sputum samples from 365 patients were screened for MTB and INH resistance using the MspI RFLP analysis disclosed in Example 4(B). Primers katG904 (SEQ ID NO:16) and katG1523 (SEQ ID NO:17) were used in a polymerase chain reaction as described in Example 4 to produce a 620 base pair katG gene fragment or "amplicon" (base pairs 904 through 1523), which was digested with MspI. The resulting restriction fragment pattern was visualized using gel electrophoresis. A result considered "positive for MTB" was defined as production of a katG amplicon that generated an RFLP pattern (see FIG. 8) indicative of wild-type MTB or mutant MTB (MTB containing a S315T or R463L mutation, or both, in the 620 base pair katG amplicon). A negative result was defined as the failure to produce a katG amplicon (i.e., failure to produce the 620 base pair segment in the PCR), or, in rare cases, the production of a katG amplicon followed by generation of an RFLP pattern that differed from the RFLP pattern known to be associated with wild-type or mutant (S315T, R463L or S315T/R463L) MTB.
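The definitions of positive and negative results given above amount to a small decision rule, sketched below. The fragment sizes are those reported in Example 4(B); exact matching of sizes is a simplification, since reading a real gel involves judging band positions, and the function and dictionary names are hypothetical.

```python
# Visible MspI fragment patterns (bp, ascending) reported in Example 4(B).
KNOWN_PATTERNS = {
    (65, 137, 153, 228): "wild-type",
    (65, 132, 137, 228): "S315T",
    (153, 202, 228): "R463L",
    (132, 202, 228): "S315T/R463L",
}

def call_result(amplicon_produced, fragments=()):
    """Return (result, genotype) following the definitions in the text.

    Simplification: observed sizes must match a known pattern exactly.
    """
    if not amplicon_produced:
        # No 620 bp katG amplicon: negative for MTB.
        return ("negative", None)
    genotype = KNOWN_PATTERNS.get(tuple(sorted(fragments)))
    if genotype is None:
        # Amplicon present but pattern unlike any known MTB pattern: negative.
        return ("negative", None)
    return ("positive for MTB", genotype)

if __name__ == "__main__":
    print(call_result(False))
    print(call_result(True, [228, 153, 137, 65]))
    print(call_result(True, [228, 202, 132]))
    print(call_result(True, [300, 320]))
```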

The results of this PCR-RFLP assay were compared to results for acid-fast bacilli (AFB) staining by the Ziehl-Neelsen method, and to results of culture and INH susceptibility testing using the BACTEC radiometric method. Technologists performed PCR-RFLP after AFB stains and cultures were done, and were blinded to the results for AFB stains and cultures. Patient charts were also reviewed for clinical correlation.

Seventy of 785 (8.9%) sputa were AFB stain-positive. MTB was cultured from 48 of these 70 samples. For 9 other AFB stain-positive samples, MTB was not isolated; however, MTB had been isolated from a recent prior sputum from these same patients. Eight of these 9 patients were receiving antituberculous therapy at the time the sputum was collected for the study.

Mycobacteria other than MTB (MOTT) were exclusively cultured from 9 other AFB stain-positive specimens [M. avium-intracellulare (6), M. fortuitum (2), M. kansasii (1)]. No mycobacteria (MTB or MOTT) were cultured from 2 AFB stain-positive sputa obtained from two patients whose recent prior sputa did not grow mycobacteria. No clinical laboratory information (including whether prior cultures for mycobacteria were done) was available for the 2 remaining AFB stain-positive but culture-negative sputa.

The results for the PCR-RFLP were positive for MTB for 45 of the 48 AFB stain-positive samples that grew MTB. Two of these 3 discordant samples had few AFB on stain and 1 had rare AFB on stain. Additional 5 microliter aliquots from these 3 discordant samples were tested and produced positive results for PCR-RFLP. For the 9 samples that were AFB stain-positive, MTB culture-negative, and where recent prior samples from the same patient were MTB culture-positive, PCR-RFLP results were positive for MTB. In no case was a katG amplicon generated for the 9 samples from which MOTT were recovered on culture. PCR-RFLP results were also negative for MTB for the 4 remaining AFB stain-positive, culture-negative samples.

For AFB stain-positive samples from which MTB was isolated and for which susceptibility testing was performed (n=45), 7 isolates (15.6%) were INH resistant. All of these INH resistant isolates were detected by the PCR-RFLP and analysis of the RFLP pattern showed them to carry the S315T mutation. These 7 isolates were from 7 different patients and 4 had different susceptibility patterns for other drugs.

The PCR-RFLP method produced a katG amplicon for 39 of the 715 AFB stain-negative samples. MTB was cultured from only 3 of these samples. However, for 21 of the 39 samples, although MTB was not cultured, MTB was recovered from recent prior samples from the same patient and/or the patient was receiving antituberculous therapy at the time the study sample was collected. Comprehensive clinical and laboratory chart reviews were not available for the remaining 15 patients.

This PCR-RFLP MTB katG assay, which can be performed in one working day, is thus a reliable, rapid method for detecting MTB and determining INH susceptibility directly from AFB stain-positive sputa.

All publications, patents and patent documents are incorporated by reference herein, as though individually incorporated by reference. The invention has been described with reference to various specific and preferred embodiments and techniques. However, it should be understood that many variations and modifications may be made while remaining within the spirit and scope of the invention.

    __________________________________________________________________________     #             SEQUENCE LISTING     - (1) GENERAL INFORMATION:     -    (iii) NUMBER OF SEQUENCES: 22     - (2) INFORMATION FOR SEQ ID NO:1:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 2235 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:     - AGGAATGCTG TGCCCGAGCA ACACCCACCC ATTACAGAAA CCACCACCGG AG - #CCGCTAGC       60     - AACGGCTGTC CCGTCGTGGG TCATATGAAA TACCCCGTCG AGGGCGGCGG AA - #ACCAGGAC      120     - TGGTGGCCCA ACCGGCTCAA TCTGAAGGTA CTGCACCAAA ACCCGGCCGT CG - #CTGACCCG      180     - ATGGGTGCGG CGTTCGACTA TGCCGCGGAG GTCGCGACCA TCGACGTTGA CG - #CCCTGACG      240     - CGGGACATCG AGGAAGTGAT GACCACCTCG CAGCCGTGGT GGCCCGCCGA CT - #ACGGCCAC      300     - TACGGGCCGC TGTTTATCCG GATGGCGTGG CACGCTGCCG GCACCTACCG CA - #TCCACGAC      360     - GGCCGCGGCG GCGCCGGGGG CGGCATGCAG CGGTTCGCGC CGCTTAACAG CT - #GGCCCGAC      420     - AACGCCAGCT TGGACAAGGC GCGCCGGCTG CTGTGGCCGG TCAAGAAGAA GT - #ACGGCAAG      480     - AAGCTCTCAT GGGCGGACCT GATTGTTTTC GCCGGCAACT GCGCGCTGGA AT - #CGATGGGC      540     - TTCAAGACGT TCGGGTTCGG CTTCGGCCGG GTCGACCAGT GGGAGCCCGA TG - #AGGTCTAT      600     - TGGGGCAAGG AAGCCACCTG GCTCGGCGAT GAGCGTTACA GCGGTAAGCG GG - #ATCTGGAG      660     - AACCCGCTGG CCGCGGTGCA GATGGGGCTG ATCTACGTGA ACCCGGAGGG GC - #CGAACGGC      720     - AACCCGGACC CCATGGCCGC GGCGGTCGAC ATTCGCGAGA CGTTTCGGCG CA - #TGGCCATG      780     - AACGACGTCG AAACAGCGGC GCTGATCGTC GGCGGTCACA CTTTCGGTAA GA - #CCCATGGC      840     - GCCGGCCCGG CCGATCTGGT CGGCCCCGAA CCCGAGGCTG CTCCGCTGGA GC - #AGATGGGC      900     - TTGGGCTGGA AGAGCTCGTA TGGCACCGGA ACCGGTAAGG ACGCGATCAC CA - #GCGGCATC      960     - GAGGTCGTAT GGACGAACAC CCCGACGAAA TGGGACAACA GTTTCCTCGA GA - #TCCTGTAC     1020     - GGCTACGAGT GGGAGCTGAC GAAGAGCCCT GCTGGCGCTT 
GGCAATACAC CG - #CCAAGGAC     1080     - GGCGCCGGTG CCGGCACCAT CCCGGACCCG TTCGGCGGGC CAGGGCGCTC CC - #CGACGATG     1140     - CTGGCCACTG ACCTCTCGCT GCGGGTGGAT CCGATCTATG AGCGGATCAC GC - #GTCGCTGG     1200     - CTGGAACACC CCGAGGAATT GGCCGACGAG TTCGCCAAGG CCTGGTACAA GC - #TGATCCAC     1260     - CGAGACATGG GTCCCGTTGC GAGATACCTT GGGCCGCTGG TCCCCAAGCA GA - #CCCTGCTG     1320     - TGGCAGGATC CGGTCCCTGC GGTCAGCCAC GACCTCGTCG GCGAAGCCGA GA - #TTGCCAGC     1380     - CTTAAGAGCC AGATCCGGGC ATCGGGATTG ACTGTCTCAC AGCTAGTTTC GA - #CCGCATGG     1440     - GCGGCGGCGT CGTCGTTCCG TGGTAGCGAC AAGCGCGGCG GCGCCAACGG TG - #GTCGCATC     1500     - CGCCTGCAGC CACAAGTCGG GTGGGAGGTC AACGACCCCG ACGGGGATCT GC - #GCAAGGTC     1560     - ATTCGCACCC TGGAAGAGAT CCAGGAGTCA TTCAACTCCG CGGCGCCGGG GA - #ACATCAAA     1620     - GTGTCCTTCG CCGACCTCGT CGTGCTCGGT GGCTGTGCCG CCATAGAGAA AG - #CAGCAAAG     1680     - GCGGCTGGCC ACAACATCAC GGTGCCCTTC ACCCCGGGCC GCACGGATGC GT - #CGCAGGAA     1740     - CAAACCGACG TGGAATCCTT TGCCGTGCTG GAGCCCAAGG CAGATGGCTT CC - #GAAACTAC     1800     - CTCGGAAAGG GCAACCCGTT GCCGGCCGAG TACATGCTGC TCGACAAGGC GA - #ACCTGCTT     1860     - ACGCTCAGTG CCCCTGAGAT GACGGTGCTG GTAGGTGGCC TGCGCGTCCT CG - #GGCAAACT     1920     - ACAAGCGCTT ACCGCTGGGC GTGTTCACCG AGGCCTCCGA GTCACTGACC AA - #CGACTTCT     1980     - TCGTGAACCT GCTCGACATG GGTATCACCT GGGAGCCCTC GCCAGCAGAT GA - #CGGGACCT     2040     - ACCAGGGCAA GGATGGCAGT GGCAAGGTGA AGTGGACCGG CAGCCGCGTG GA - #CCTGGTCT     2100     - TCGGGTCCAA CTCGGAGTTG CGGGCGCTTG TCGAGGTCTA TGGCGCCGAT GA - #CGCGCAGC     2160     - CGAAGTTCGT GCAGGACTTC GTCGCTGCCT GGGACAAGGT GATGAACCTC GA - #CAGGTTCG     2220     #  2235     - (2) INFORMATION FOR SEQ ID NO:2:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 2221 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:     - AGGAATGCTG 
TGCCCGAGCA ACACCCACCC ATTACAGAAA CCACCACCGG AG - #CCGCTAGC       60     - AACGGCTGTC CCGTCGTGGG TCATATGAAA TACCCCGTCG AGGGCGGCGG AA - #ACCAGGAC      120     - TGGTGGCCCA ACCGGCTCAA TCTGAAGGTA CTGCACCAAA ACCCGGCCGT CG - #CTGACCCG      180     - ATGGGTGCGG CGTTCGACTA TGCCGCGGAG GTCGCGACCA GTCGACTTGA CG - #CCCTGACG      240     - CGGGACATCG AGGAAGTGAT GACCACCTCG CAGCCGTGGT GGCCCGCCGA CT - #ACGGCCAC      300     - TACGGGCCGC TGTTTATCCG GATGGCGTGG CACGCTGCCG GCACCTACCG CA - #TCCACGAC      360     - GGCCGCGGCG GCGCCGGGGG CGGCATGCAG CGGTTCGCGC CGCTTAACAG CT - #GGCCCGAC      420     - AACGCCAGCT TGGACAAGGC GCGCCGGCTG CTGTGGCCGG TCAAGAAGAA GT - #ACGGCAAG      480     - AAGCTCTCAT GGGCGGACCT GATTGTTTTC GCCGGCAACC GCTGCGCTCG GA - #ATCGATGG      540     - GCTTCAAGAC GTTCGGGTTC GGCTTCGGGC GTCGACCAGT GGGAGACCGA TG - #AGGTCTAT      600     - TGGGGCAAGG AAGCCACCTG GCTCGGCGAT GACGGTTACA GCGTAAGCGA TC - #TGGAGAAC      660     - CCGCTGGCCG CGGTGCAGAT GGGGCTGATC TACGTGAACC CGGAGGCGCC GA - #ACGGCAAC      720     - CCGGACCCCA TGGCCGCGGC GGTCGACATT CGCGAGACGT TTCGGCGCAT GG - #CCATGAAC      780     - GACGTCGAAA CAGCGGCGCT GATCGTCGGC GGTCACACTT TCGGTAAGAC CC - #ATGGCGCC      840     - GGCCCGGCCG ATCTGGTCGG CCCCGAACCC GAGGCTGCTC CGCTGGAGCA GA - #TGGGCTTG      900     - GGCTGGAAGA GCTCGTATGG CACCGGAACC GGTAAGGACG CGATCACCAG CG - #GCATCGAG      960     - GTCGTATGGA CGAACACCCC GACGAAATGG GACAACAGTT TCCTCGAGAT CC - #TGTACGGC     1020     - TACGAGTGGG AGCTGACGAA GAGCCCTGCT GGCGCTTGGC AATACACCGC CA - #AGGACGGC     1080     - GCCGGTGCCG GCACCATCCC GGACCCGTTC GGCGGGCCAG GGCGCTCCCC GA - #CGATGCTG     1140     - GCCACTGACC TCTCGCTGCG GGTGGATCCG ATCTATGAGC GGATCACGCG TC - #GCTGGCTG     1200     - GAACACCCCG AGGAATTGGC CGACGAGTTC CGCAAGGCCT GGTACAAGCT GA - #TCCACCGA     1260     - GACATGGGTC CCGTTGCGAG ATACCTTGGG CCGCTGGTCC CCAAGCAGAC CC - #TGCTGTGG     1320     - CAGGATCCGG TCCCTGCGGT CAGCACGACC TCGTCGGCGA AGCAGATTGC CA - #GCCTTAAG     1380     - AGCCAGATCC GGGCATCGGG ATTGACTGTC TCACAGCTAG TTTCGACCGC 
AT - #GGGCGGCG     1440     - GCGTCGTCGT TCCGTGGTAG CGACAAGCGC GGCGGCGCCA ACGGTGGTCG CA - #TCCGCCTG     1500     - CAGCCACAAG TCGGGTGGGA GGTCAACGAC CCCGACGGAT CTGCGCAAGG TC - #ATTCGCAC     1560     - CCTGAAGAGA TCCAGGAGTC ATTCACTCGG CGCGGGAACA TCAAAGTGTC CT - #TCGCCGAC     1620     - CTCGTCGTGC TCGGTGGCTG TGCGCCACTA GAGAAAGCAG CAAAGGCGGC TG - #GCCACAAC     1680     - ATCACGGTGC CCTTCACCCC GGGCCCGCAC GATGCGTCGC AGGAACAAAC CG - #ACGTGGAA     1740     - TCCTTTGCCG TGCTGGAGCC CAAGGCAGAT GGCTTCCGAA ACTACCTCGG AA - #AGGGCAAC     1800     - CGTTGCCGGC CGAGTACATC GCTGCTCGAC AAGGCGAACC TGCTTACGCT CA - #GTGCCCCT     1860     - GAGATGACGG TGCTGGTAGG TGGCCTGCGC GTCCTCGGCG CAAACTACAA GC - #GCTTACCG     1920     - CTGGGCGTGT TCACCGAGGC CTCCGAGTCA CTGACCAACG ACTTCTTCGT GA - #ACCTGCTC     1980     - GACATGGGTA TCACCTGGGA GCCCTCGCCA GCAGATGACG GGACCTACCA GG - #GCAAGGAT     2040     - GGCAGTGGCA AGGTGAAGTG GACCGGCAGC CGCGTGGACC TGGTCTTCGG GT - #CCAACTCG     2100     - GAGTTGCGGG CGCTTGTCGA GGTCTATGCG CCGATGACGC GGCAGGCGAA GT - #TCGTGACA     2160     - GGATTCGTCG CTGCGTGGGA CAAGGTGATG AACCTCGACA GGTTCGACGT GC - #GCTGATTC     2220     #             2221     - (2) INFORMATION FOR SEQ ID NO:3:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 30 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:     #           30     TCCT GTTGGACGAG     - (2) INFORMATION FOR SEQ ID NO:4:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 30 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:     #           30     GAGG TCGTGCTGAC     - (2) INFORMATION FOR SEQ ID NO:5:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 30 base               (B) TYPE: nucleic acid  
             (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:     #           30     GACA ACAGTTTCCT     - (2) INFORMATION FOR SEQ ID NO:6:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 30 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:     #           30     GCCG GGCAAACACC     - (2) INFORMATION FOR SEQ ID NO:7:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 740 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:     -      Val Pro Glu Gly His Pro Pro Ile - # Thr Glu Thr Thr Thr Gly Ala     Ala     #   15     -      Ser Asn Gly Cys Pro Val Val Gly - # His Met Lys Tyr Pro Val Glu     Gly     #                 30     -      Gly Gly Asn Gln Asp Trp Trp Pro - # Asn Arg Leu Asn Leu Lys Val     Leu     #             45     -      His Gln Asn Pro Ala Val Ala Asp - # Pro Met Gly Ala Ala Phe Asp     Tyr     #         60     -      Ala Ala Glu Val Ala Thr Ile Asp - # Val Asp Ala Leu Thr Arg Asp     Ile     #     80     -      Glu Glu Val Met Thr Thr Ser Gln - # Pro Trp Trp Pro Ala Asp Tyr     Gly     #   95     -      His Tyr Gly Pro Leu Phe Ile Arg - # Met Ala Trp His Ala Ala Gly     Thr     #                110     -      Tyr Arg Ile His Asp Gly Arg Gly - # Gly Ala Gly Gly Gly Met Gln     Arg     #            125     -      Phe Ala Pro Leu Asn Ser Trp Pro - # Asp Asn Ala Ser Leu Asp Lys     Ala     #        140     -      Arg Arg Leu Leu Trp Pro Val Lys - # Lys Lys Tyr Gly Lys Lys Leu     Ser     #    160     -      Trp Ala Asp Leu Ile Val Phe Ala - # Gly Asn Cys Ala Leu Glu Ser     Met     #   175     -      Gly Phe Lys Thr Phe Gly Phe Gly 
- # Phe Gly Arg Val Asp Gln Trp     Glu     #                190     -      Pro Asp Glu Val Tyr Trp Gly Lys - # Glu Ala Thr Trp Leu Gly Asp     Glu     #            205     -      Arg Tyr Ser Gly Lys Arg Asp Leu - # Glu Asn Pro Leu Ala Ala Val     Gln     #        220     -      Met Gly Leu Ile Tyr Val Asn Pro - # Glu Gly Pro Asn Gly Asn Pro     Asp     #    240     -      Pro Met Ala Ala Ala Val Asp Ile - # Arg Glu Thr Phe Arg Arg Met     Ala     #   255     -      Met Asn Asp Val Glu Thr Ala Ala - # Leu Ile Val Gly Gly His Thr     Phe     #                270     -      Gly Lys Thr His Gly Ala Gly Pro - # Ala Asp Leu Val Gly Pro Glu     Pro     #            285     -      Glu Ala Ala Pro Leu Glu Gln Met - # Gly Leu Gly Trp Lys Ser Ser     Tyr     #        300     -      Gly Thr Gly Thr Gly Lys Asp Ala - # Ile Thr Ser Gly Ile Glu Val     Val     #    320     -      Trp Thr Asn Thr Pro Thr Lys Trp - # Asp Asn Ser Phe Leu Glu Ile     Leu     #   335     -      Tyr Gly Tyr Glu Trp Glu Leu Thr - # Lys Ser Pro Ala Gly Ala Trp     Gln     #                350     -      Tyr Thr Ala Lys Asp Gly Ala Gly - # Ala Gly Thr Ile Pro Asp Pro     Phe     #            365     -      Gly Gly Pro Gly Arg Ser Pro Thr - # Met Leu Ala Thr Asp Leu Ser     Leu     #        380     -      Arg Val Asp Pro Ile Tyr Glu Arg - # Ile Thr Arg Arg Trp Leu Glu     His     #    400     -      Pro Glu Glu Leu Ala Asp Glu Phe - # Ala Lys Ala Trp Tyr Lys Leu     Ile     #   415     -      His Arg Asp Met Gly Pro Val Ala - # Arg Tyr Leu Gly Pro Leu Val     Pro     #                430     -      Lys Gln Thr Leu Leu Trp Gln Asp - # Pro Val Pro Ala Val Ser His     Asp     #            445     -      Leu Val Gly Glu Ala Glu Ile Ala - # Ser Leu Lys Ser Gln Ile Arg     Ala     #        460     -      Ser Gly Leu Thr Val Ser Gln Leu - # Val Ser Thr Ala Trp Ala Ala     Ala     #    480     -      Ser Ser Phe Arg Gly Ser Asp Lys - # Arg Gly Gly Ala Asn Gly Gly     Arg     #   495     -      Ile Arg Leu Gln Pro 
Gln Val Gly - # Trp Glu Val Asn Asp Pro Asp     Gly     #                510     -      Asp Leu Arg Lys Val Ile Arg Thr - # Leu Glu Glu Ile Gln Glu Ser     Phe     #            525     -      Asn Ser Ala Ala Pro Gly Asn Ile - # Lys Val Ser Phe Ala Asp Leu     Val     #        540     -      Val Leu Gly Gly Cys Ala Ala Ile - # Glu Lys Ala Ala Lys Ala Ala     Gly     #    560     -      His Asn Ile Thr Val Pro Phe Thr - # Pro Gly Arg Thr Asp Ala Ser     Gln     #   575     -      Glu Gln Thr Asp Val Glu Ser Phe - # Ala Val Leu Glu Pro Lys Ala     Asp     #                590     -      Gly Phe Arg Asn Tyr Leu Gly Lys - # Gly Asn Pro Leu Pro Ala Glu     Tyr     #            605     -      Met Leu Leu Asp Lys Ala Asn Leu - # Leu Thr Leu Ser Ala Pro Glu     Met     #        620     -      Thr Val Leu Val Gly Gly Leu Arg - # Val Leu Gly Ala Asn Tyr Lys     Arg     #    640     -      Leu Pro Leu Gly Val Phe Thr Glu - # Ala Ser Glu Ser Leu Thr Asn     Asp     #   655     -      Phe Phe Val Asn Leu Leu Asp Met - # Gly Ile Thr Trp Glu Pro Ser     Pro     #                670     -      Ala Asp Asp Gly Thr Tyr Gln Gly - # Lys Asp Gly Ser Gly Lys Val     Lys     #            685     -      Trp Thr Gly Ser Arg Val Asp Leu - # Val Phe Gly Ser Asn Ser Glu     Leu     #        700     -      Arg Ala Leu Val Glu Val Tyr Gly - # Ala Asp Asp Ala Gln Pro Lys     Phe     #    720     -      Val Gln Asp Phe Val Ala Ala Trp - # Asp Lys Val Met Asn Leu Asp     Arg     #   735     -      Phe Asp Val Arg                      740     - (2) INFORMATION FOR SEQ ID NO:8:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 27 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:     #             27   TGAT CGTCGGC     - (2) INFORMATION FOR SEQ ID NO:9:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 27 base            
   (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:     #             27   TGAT CGTCGGC     - (2) INFORMATION FOR SEQ ID NO:10:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 27 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:     #             27   GCTA CGAGTGG     - (2) INFORMATION FOR SEQ ID NO:11:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 27 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:     #             27   GCTA CGAGTGG     - (2) INFORMATION FOR SEQ ID NO:12:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 27 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:     #             27   GCAT CGAGGTC     - (2) INFORMATION FOR SEQ ID NO:13:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 27 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:     #             27   GCAT CGAGGTC     - (2) INFORMATION FOR SEQ ID NO:14:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 27 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:     #             27   CATC GGGATTG     - (2) INFORMATION FOR SEQ ID NO:15:     
-      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 27 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:     #             27   CATC GGGATTG     - (2) INFORMATION FOR SEQ ID NO:16:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 20 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:     # 20               GAAC     - (2) INFORMATION FOR SEQ ID NO:17:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 20 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:     # 20               CTTG     - (2) INFORMATION FOR SEQ ID NO:18:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 20 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:     # 20               GAGA     - (2) INFORMATION FOR SEQ ID NO:19:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 20 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:     # 20               TCGT     - (2) INFORMATION FOR SEQ ID NO:20:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 2331 base               (B) TYPE: nucleic acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (ix) FEATURE:               (A) NAME/KEY: CDS               (B) 
LOCATION: 70..2289     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:     - CGATATCCGA CACTTCGCGA TCACATCCGT GATCACAGCC CGATAACACC AA - #CTCCTGGA       60     - AGGAATGCT GTG CCC GAG CAA CAC CCA CCC ATT ACA - # GAA ACC ACC ACC      108     #Pro Pro Ile Thr Glu Thr Thr Thr     #         10     - GGA GCC GCT AGC AAC GGC TGT CCC GTC GTG GG - #T CAT ATG AAA TAC CCC      156     Gly Ala Ala Ser Asn Gly Cys Pro Val Val Gl - #y His Met Lys Tyr Pro     #     25     - GTC GAG GGC GGC GGA AAC CAG GAC TGG TGG CC - #C AAC CGG CTC AAT CTG      204     Val Glu Gly Gly Gly Asn Gln Asp Trp Trp Pr - #o Asn Arg Leu Asn Leu     # 45     - AAG GTA CTG CAC CAA AAC CCG GCC GTC GCT GA - #C CCG ATG GGT GCG GCG      252     Lys Val Leu His Gln Asn Pro Ala Val Ala As - #p Pro Met Gly Ala Ala     #                 60     - TTC GAC TAT GCC GCG GAG GTC GCG ACC ATC GA - #C GTT GAC GCC CTG ACG      300     Phe Asp Tyr Ala Ala Glu Val Ala Thr Ile As - #p Val Asp Ala Leu Thr     #             75     - CGG GAC ATC GAG GAA GTG ATG ACC ACC TCG CA - #G CCG TGG TGG CCC GCC      348     Arg Asp Ile Glu Glu Val Met Thr Thr Ser Gl - #n Pro Trp Trp Pro Ala     #         90     - GAC TAC GGC CAC TAC GGG CCG CTG TTT ATC CG - #G ATG GCG TGG CAC GCT      396     Asp Tyr Gly His Tyr Gly Pro Leu Phe Ile Ar - #g Met Ala Trp His Ala     #    105     - GCC GGC ACC TAC CGC ATC CAC GAC GGC CGC GG - #C GGC GCC GGG GGC GGC      444     Ala Gly Thr Tyr Arg Ile His Asp Gly Arg Gl - #y Gly Ala Gly Gly Gly     110                 1 - #15                 1 - #20                 1 -     #25     - ATG CAG CGG TTC GCG CCG CTT AAC AGC TGG CC - #C GAC AAC GCC AGC TTG      492     Met Gln Arg Phe Ala Pro Leu Asn Ser Trp Pr - #o Asp Asn Ala Ser Leu     #               140     - GAC AAG GCG CGC CGG CTG CTG TGG CCG GTC AA - #G AAG AAG TAC GGC AAG      540     Asp Lys Ala Arg Arg Leu Leu Trp Pro Val Ly - #s Lys Lys Tyr Gly Lys     #           155     - AAG CTC TCA TGG GCG GAC CTG ATT GTT TTC GC - #C GGC AAC TGC GCG CTG      588     
Lys Leu Ser Trp Ala Asp Leu Ile Val Phe Al - #a Gly Asn Cys Ala Leu     #       170     - GAA TCG ATG GGC TTC AAG ACG TTC GGG TTC GG - #C TTC GGC CGG GTC GAC      636     Glu Ser Met Gly Phe Lys Thr Phe Gly Phe Gl - #y Phe Gly Arg Val Asp     #   185     - CAG TGG GAG CCC GAT GAG GTC TAT TGG GGC AA - #G GAA GCC ACC TGG CTC      684     Gln Trp Glu Pro Asp Glu Val Tyr Trp Gly Ly - #s Glu Ala Thr Trp Leu     190                 1 - #95                 2 - #00                 2 -     #05     - GGC GAT GAG CGT TAC AGC GGT AAG CGG GAT CT - #G GAG AAC CCG CTG GCC      732     Gly Asp Glu Arg Tyr Ser Gly Lys Arg Asp Le - #u Glu Asn Pro Leu Ala     #               220     - GCG GTG CAG ATG GGG CTG ATC TAC GTG AAC CC - #G GAG GGG CCG AAC GGC      780     Ala Val Gln Met Gly Leu Ile Tyr Val Asn Pr - #o Glu Gly Pro Asn Gly     #           235     - AAC CCG GAC CCC ATG GCC GCG GCG GTC GAC AT - #T CGC GAG ACG TTT CGG      828     Asn Pro Asp Pro Met Ala Ala Ala Val Asp Il - #e Arg Glu Thr Phe Arg     #       250     - CGC ATG GCC ATG AAC GAC GTC GAA ACA GCG GC - #G CTG ATC GTC GGC GGT      876     Arg Met Ala Met Asn Asp Val Glu Thr Ala Al - #a Leu Ile Val Gly Gly     #   265     - CAC ACT TTC GGT AAG ACC CAT GGC GCC GGC CC - #G GCC GAT CTG GTC GGC      924     His Thr Phe Gly Lys Thr His Gly Ala Gly Pr - #o Ala Asp Leu Val Gly     270                 2 - #75                 2 - #80                 2 -     #85     - CCC GAA CCC GAG GCT GCT CCG CTG GAG CAG AT - #G GGC TTG GGC TGG AAG      972     Pro Glu Pro Glu Ala Ala Pro Leu Glu Gln Me - #t Gly Leu Gly Trp Lys     #               300     - AGC TCG TAT GGC ACC GGA ACC GGT AAG GAC GC - #G ATC ACC AGC GGC ATC     1020     Ser Ser Tyr Gly Thr Gly Thr Gly Lys Asp Al - #a Ile Thr Ser Gly Ile     #           315     - GAG GTC GTA TGG ACG AAC ACC CCG ACG AAA TG - #G GAC AAC AGT TTC CTC     1068     Glu Val Val Trp Thr Asn Thr Pro Thr Lys Tr - #p Asp Asn Ser Phe Leu     #       330     - GAG ATC CTG TAC GGC TAC GAG TGG GAG CTG AC - #G 
AAG AGC CCT GCT GGC     1116     Glu Ile Leu Tyr Gly Tyr Glu Trp Glu Leu Th - #r Lys Ser Pro Ala Gly     #   345     - GCT TGG CAA TAC ACC GCC AAG GAC GGC GCC GG - #T GCC GGC ACC ATC CCG     1164     Ala Trp Gln Tyr Thr Ala Lys Asp Gly Ala Gl - #y Ala Gly Thr Ile Pro     350                 3 - #55                 3 - #60                 3 -     #65     - GAC CCG TTC GGC GGG CCA GGG CGC TCC CCG AC - #G ATG CTG GCC ACT GAC     1212     Asp Pro Phe Gly Gly Pro Gly Arg Ser Pro Th - #r Met Leu Ala Thr Asp     #               380     - CTC TCG CTG CGG GTG GAT CCG ATC TAT GAG CG - #G ATC ACG CGT CGC TGG     1260     Leu Ser Leu Arg Val Asp Pro Ile Tyr Glu Ar - #g Ile Thr Arg Arg Trp     #           395     - CTG GAA CAC CCC GAG GAA TTG GCC GAC GAG TT - #C GCC AAG GCC TGG TAC     1308     Leu Glu His Pro Glu Glu Leu Ala Asp Glu Ph - #e Ala Lys Ala Trp Tyr     #       410     - AAG CTG ATC CAC CGA GAC ATG GGT CCC GTT GC - #G AGA TAC CTT GGG CCG     1356     Lys Leu Ile His Arg Asp Met Gly Pro Val Al - #a Arg Tyr Leu Gly Pro     #   425     - CTG GTC CCC AAG CAG ACC CTG CTG TGG CAG GA - #T CCG GTC CCT GCG GTC     1404     Leu Val Pro Lys Gln Thr Leu Leu Trp Gln As - #p Pro Val Pro Ala Val     430                 4 - #35                 4 - #40                 4 -     #45     - AGC CAC GAC CTC GTC GGC GAA GCC GAG ATT GC - #C AGC CTT AAG AGC CAG     1452     Ser His Asp Leu Val Gly Glu Ala Glu Ile Al - #a Ser Leu Lys Ser Gln     #               460     - ATC CGG GCA TCG GGA TTG ACT GTC TCA CAG CT - #A GTT TCG ACC GCA TGG     1500     Ile Arg Ala Ser Gly Leu Thr Val Ser Gln Le - #u Val Ser Thr Ala Trp     #           475     - GCG GCG GCG TCG TCG TTC CGT GGT AGC GAC AA - #G CGC GGC GGC GCC AAC     1548     Ala Ala Ala Ser Ser Phe Arg Gly Ser Asp Ly - #s Arg Gly Gly Ala Asn     #       490     - GGT GGT CGC ATC CGC CTG CAG CCA CAA GTC GG - #G TGG GAG GTC AAC GAC     1596     Gly Gly Arg Ile Arg Leu Gln Pro Gln Val Gl - #y Trp Glu Val Asn Asp     #   505     - CCC GAC GGG GAT 
CTG CGC AAG GTC ATT CGC AC - #C CTG GAA GAG ATC CAG     1644     Pro Asp Gly Asp Leu Arg Lys Val Ile Arg Th - #r Leu Glu Glu Ile Gln     510                 5 - #15                 5 - #20                 5 -     #25     - GAG TCA TTC AAC TCC GCG GCG CCG GGG AAC AT - #C AAA GTG TCC TTC GCC     1692     Glu Ser Phe Asn Ser Ala Ala Pro Gly Asn Il - #e Lys Val Ser Phe Ala     #               540     - GAC CTC GTC GTG CTC GGT GGC TGT GCC GCC AT - #A GAG AAA GCA GCA AAG     1740     Asp Leu Val Val Leu Gly Gly Cys Ala Ala Il - #e Glu Lys Ala Ala Lys     #           555     - GCG GCT GGC CAC AAC ATC ACG GTG CCC TTC AC - #C CCG GGC CGC ACG GAT     1788     Ala Ala Gly His Asn Ile Thr Val Pro Phe Th - #r Pro Gly Arg Thr Asp     #       570     - GCG TCG CAG GAA CAA ACC GAC GTG GAA TCC TT - #T GCC GTG CTG GAG CCC     1836     Ala Ser Gln Glu Gln Thr Asp Val Glu Ser Ph - #e Ala Val Leu Glu Pro     #   585     - AAG GCA GAT GGC TTC CGA AAC TAC CTC GGA AA - #G GGC AAC CCG TTG CCG     1884     Lys Ala Asp Gly Phe Arg Asn Tyr Leu Gly Ly - #s Gly Asn Pro Leu Pro     590                 5 - #95                 6 - #00                 6 -     #05     - GCC GAG TAC ATG CTG CTC GAC AAG GCG AAC CT - #G CTT ACG CTC AGT GCC     1932     Ala Glu Tyr Met Leu Leu Asp Lys Ala Asn Le - #u Leu Thr Leu Ser Ala     #               620     - CCT GAG ATG ACG GTG CTG GTA GGT GGC CTG CG - #C GTC CTC GGC GCA AAC     1980     Pro Glu Met Thr Val Leu Val Gly Gly Leu Ar - #g Val Leu Gly Ala Asn     #           635     - TAC AAG CGC TTA CCG CTG GGC GTG TTC ACC GA - #G GCC TCC GAG TCA CTG     2028     Tyr Lys Arg Leu Pro Leu Gly Val Phe Thr Gl - #u Ala Ser Glu Ser Leu     #       650     - ACC AAC GAC TTC TTC GTG AAC CTG CTC GAC AT - #G GGT ATC ACC TGG GAG     2076     Thr Asn Asp Phe Phe Val Asn Leu Leu Asp Me - #t Gly Ile Thr Trp Glu     #   665     - CCC TCG CCA GCA GAT GAC GGG ACC TAC CAG GG - #C AAG GAT GGC AGT GGC     2124     Pro Ser Pro Ala Asp Asp Gly Thr Tyr Gln Gl - #y Lys Asp Gly Ser Gly     
670                 6 - #75                 6 - #80                 6 -     #85     - AAG GTG AAG TGG ACC GGC AGC CGC GTG GAC CT - #G GTC TTC GGG TCC AAC     2172     Lys Val Lys Trp Thr Gly Ser Arg Val Asp Le - #u Val Phe Gly Ser Asn     #               700     - TCG GAG TTG CGG GCG CTT GTC GAG GTC TAT GG - #C GCC GAT GAC GCG CAG     2220     Ser Glu Leu Arg Ala Leu Val Glu Val Tyr Gl - #y Ala Asp Asp Ala Gln     #           715     - CCG AAG TTC GTG CAG GAC TTC GTC GCT GCC TG - #G GAC AAG GTG ATG AAC     2268     Pro Lys Phe Val Gln Asp Phe Val Ala Ala Tr - #p Asp Lys Val Met Asn     #       730     - CTC GAC AGG TTC GAC GTG CGC TGATTCGGGT TGATCGGCC - #C TGCCCGCCGA     2319     Leu Asp Arg Phe Asp Val Arg     #   740     #     2331     - (2) INFORMATION FOR SEQ ID NO:21:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 740 amino               (B) TYPE: amino acid               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:     - Val Pro Glu Gln His Pro Pro Ile Thr Glu Th - #r Thr Thr Gly Ala Ala     #                 15     - Ser Asn Gly Cys Pro Val Val Gly His Met Ly - #s Tyr Pro Val Glu Gly     #             30     - Gly Gly Asn Gln Asp Trp Trp Pro Asn Arg Le - #u Asn Leu Lys Val Leu     #         45     - His Gln Asn Pro Ala Val Ala Asp Pro Met Gl - #y Ala Ala Phe Asp Tyr     #     60     - Ala Ala Glu Val Ala Thr Ile Asp Val Asp Al - #a Leu Thr Arg Asp Ile     # 80     - Glu Glu Val Met Thr Thr Ser Gln Pro Trp Tr - #p Pro Ala Asp Tyr Gly     #                 95     - His Tyr Gly Pro Leu Phe Ile Arg Met Ala Tr - #p His Ala Ala Gly Thr     #           110     - Tyr Arg Ile His Asp Gly Arg Gly Gly Ala Gl - #y Gly Gly Met Gln Arg     #       125     - Phe Ala Pro Leu Asn Ser Trp Pro Asp Asn Al - #a Ser Leu Asp Lys Ala     #   140     - Arg Arg Leu Leu Trp Pro Val Lys Lys Lys Ty - #r Gly Lys Lys Leu Ser     145                 1 - #50                 1 - #55                 1 -     
#60     - Trp Ala Asp Leu Ile Val Phe Ala Gly Asn Cy - #s Ala Leu Glu Ser Met     #               175     - Gly Phe Lys Thr Phe Gly Phe Gly Phe Gly Ar - #g Val Asp Gln Trp Glu     #           190     - Pro Asp Glu Val Tyr Trp Gly Lys Glu Ala Th - #r Trp Leu Gly Asp Glu     #       205     - Arg Tyr Ser Gly Lys Arg Asp Leu Glu Asn Pr - #o Leu Ala Ala Val Gln     #   220     - Met Gly Leu Ile Tyr Val Asn Pro Glu Gly Pr - #o Asn Gly Asn Pro Asp     225                 2 - #30                 2 - #35                 2 -     #40     - Pro Met Ala Ala Ala Val Asp Ile Arg Glu Th - #r Phe Arg Arg Met Ala     #               255     - Met Asn Asp Val Glu Thr Ala Ala Leu Ile Va - #l Gly Gly His Thr Phe     #           270     - Gly Lys Thr His Gly Ala Gly Pro Ala Asp Le - #u Val Gly Pro Glu Pro     #       285     - Glu Ala Ala Pro Leu Glu Gln Met Gly Leu Gl - #y Trp Lys Ser Ser Tyr     #   300     - Gly Thr Gly Thr Gly Lys Asp Ala Ile Thr Se - #r Gly Ile Glu Val Val     305                 3 - #10                 3 - #15                 3 -     #20     - Trp Thr Asn Thr Pro Thr Lys Trp Asp Asn Se - #r Phe Leu Glu Ile Leu     #               335     - Tyr Gly Tyr Glu Trp Glu Leu Thr Lys Ser Pr - #o Ala Gly Ala Trp Gln     #           350     - Tyr Thr Ala Lys Asp Gly Ala Gly Ala Gly Th - #r Ile Pro Asp Pro Phe     #       365     - Gly Gly Pro Gly Arg Ser Pro Thr Met Leu Al - #a Thr Asp Leu Ser Leu     #   380     - Arg Val Asp Pro Ile Tyr Glu Arg Ile Thr Ar - #g Arg Trp Leu Glu His     385                 3 - #90                 3 - #95                 4 -     #00     - Pro Glu Glu Leu Ala Asp Glu Phe Ala Lys Al - #a Trp Tyr Lys Leu Ile     #               415     - His Arg Asp Met Gly Pro Val Ala Arg Tyr Le - #u Gly Pro Leu Val Pro     #           430     - Lys Gln Thr Leu Leu Trp Gln Asp Pro Val Pr - #o Ala Val Ser His Asp     #       445     - Leu Val Gly Glu Ala Glu Ile Ala Ser Leu Ly - #s Ser Gln Ile Arg Ala     #   460     - Ser Gly Leu Thr Val Ser Gln Leu Val Ser Th 
- #r Ala Trp Ala Ala Ala     465                 4 - #70                 4 - #75                 4 -     #80     - Ser Ser Phe Arg Gly Ser Asp Lys Arg Gly Gl - #y Ala Asn Gly Gly Arg     #               495     - Ile Arg Leu Gln Pro Gln Val Gly Trp Glu Va - #l Asn Asp Pro Asp Gly     #           510     - Asp Leu Arg Lys Val Ile Arg Thr Leu Glu Gl - #u Ile Gln Glu Ser Phe     #       525     - Asn Ser Ala Ala Pro Gly Asn Ile Lys Val Se - #r Phe Ala Asp Leu Val     #   540     - Val Leu Gly Gly Cys Ala Ala Ile Glu Lys Al - #a Ala Lys Ala Ala Gly     545                 5 - #50                 5 - #55                 5 -     #60     - His Asn Ile Thr Val Pro Phe Thr Pro Gly Ar - #g Thr Asp Ala Ser Gln     #               575     - Glu Gln Thr Asp Val Glu Ser Phe Ala Val Le - #u Glu Pro Lys Ala Asp     #           590     - Gly Phe Arg Asn Tyr Leu Gly Lys Gly Asn Pr - #o Leu Pro Ala Glu Tyr     #       605     - Met Leu Leu Asp Lys Ala Asn Leu Leu Thr Le - #u Ser Ala Pro Glu Met     #   620     - Thr Val Leu Val Gly Gly Leu Arg Val Leu Gl - #y Ala Asn Tyr Lys Arg     625                 6 - #30                 6 - #35                 6 -     #40     - Leu Pro Leu Gly Val Phe Thr Glu Ala Ser Gl - #u Ser Leu Thr Asn Asp     #               655     - Phe Phe Val Asn Leu Leu Asp Met Gly Ile Th - #r Trp Glu Pro Ser Pro     #           670     - Ala Asp Asp Gly Thr Tyr Gln Gly Lys Asp Gl - #y Ser Gly Lys Val Lys     #       685     - Trp Thr Gly Ser Arg Val Asp Leu Val Phe Gl - #y Ser Asn Ser Glu Leu     #   700     - Arg Ala Leu Val Glu Val Tyr Gly Ala Asp As - #p Ala Gln Pro Lys Phe     705                 7 - #10                 7 - #15                 7 -     #20     - Val Gln Asp Phe Val Ala Ala Trp Asp Lys Va - #l Met Asn Leu Asp Arg     #               735     - Phe Asp Val Arg                 740     - (2) INFORMATION FOR SEQ ID NO:22:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 15 base               (B) TYPE: nucleic acid               (C) 
STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:     #    15     __________________________________________________________________________ 

What is claimed is:
 1. A method for selectively detecting M. tuberculosis in a sample containing DNA, said method comprising amplifying the DNA to generate a detectable amount of amplified DNA comprising a katG DNA fragment consisting of base 904 through 1523 of the M. tuberculosis katG gene as depicted in FIG. 7 (SEQ ID NO:20), wherein the generation of the katG DNA fragment is indicative of the presence of M. tuberculosis in the sample.
 2. The method of claim 1 wherein the DNA is amplified in a polymerase chain reaction using oligonucleotide primer katG904 (SEQ ID NO:16) and oligonucleotide primer katG1523 (SEQ ID NO:17).
 3. The method of claim 1 wherein the katG DNA fragment comprises a restriction site comprising either a G or a C at the nucleotide position occupied by base 1013 in codon 315 of the M. tuberculosis katG gene as depicted in FIG. 7 (SEQ ID NO:20), said method further comprising: contacting the katG DNA fragment with a restriction endonuclease that cleaves either at said restriction site comprising a G at the nucleotide position occupied by base 1013 of codon 315, or at said restriction site comprising a C at the nucleotide position occupied by base 1013 of codon 315, but not at both of said restriction sites, to yield at least one cleaved fragment; electrophoresing the at least one cleaved fragment to yield an electrophoretic mobility pattern comprising the at least one cleaved fragment; and analyzing the mobility pattern to selectively detect the presence of M. tuberculosis in the sample.
 4. The method of claim 3 wherein the restriction endonuclease is MspI.
 5. The method of claim 3 wherein the electrophoresis comprises gel electrophoresis, and wherein the presence of M. tuberculosis in the sample is selectively detected using restriction fragment length polymorphism (RFLP) analysis of said electrophoretic mobility pattern.
 6. The method of claim 1 wherein the sample is a biological fluid.
 7. The method of claim 6 wherein the biological fluid is human sputum.
 8. The method of claim 1 further comprising determining whether or not the katG DNA fragment has a S315T mutation, wherein the presence of a S315T mutation is indicative of an INH-resistant strain of M. tuberculosis.
 9. The method of claim 8 wherein the katG DNA fragment comprises a restriction site comprising either a G or a C at the nucleotide position occupied by base 1013 in codon 315 of the M. tuberculosis katG gene as depicted in FIG. 7 (SEQ ID NO:20), and wherein the step of determining whether or not the katG DNA fragment has a S315T mutation comprises contacting the katG DNA fragment with a restriction endonuclease that cleaves either at said restriction site comprising a G at the nucleotide position occupied by base 1013 of codon 315, or at said restriction site comprising a C at the nucleotide position occupied by base 1013 of codon 315, but not at both of said restriction sites, to yield at least one cleaved fragment, and wherein cleavage at said restriction site is indicative of either the presence or the absence, but not both, of a S315T mutation in the katG DNA fragment.
 10. The method of claim 9 wherein the restriction endonuclease is MspI, and wherein cleavage at said restriction site is indicative of the presence of a S315T mutation in the katG DNA fragment.
 11. The method of claim 9 further comprising electrophoresing the at least one cleaved fragment to yield an electrophoretic mobility pattern comprising the at least one cleaved fragment; and analyzing the mobility pattern to determine the presence or absence of a S315T mutation in the katG DNA fragment.
 12. The method of claim 11 wherein the electrophoresis comprises gel electrophoresis, and wherein the presence of M. tuberculosis in the sample is selectively detected using restriction fragment length polymorphism (RFLP) analysis of said electrophoretic mobility pattern.
 13. A method for selectively detecting M. tuberculosis in a sample containing DNA, said method comprising: (a) amplifying the DNA to generate a detectable amount of amplified DNA comprising a katG DNA fragment comprising base 904 through 1523 of the M. tuberculosis katG gene, wherein the katG DNA fragment further comprises a restriction site comprising either a G or a C at the nucleotide position occupied by base 1013 in codon 315 of the M. tuberculosis katG gene as depicted in FIG. 7 (SEQ ID NO:20); (b) contacting the katG DNA fragment with a restriction endonuclease that cleaves either at said restriction site comprising a G at the nucleotide position occupied by base 1013 of codon 315, or at said restriction site comprising a C at the nucleotide position occupied by base 1013 of codon 315, but not at both of said restriction sites, to yield at least one cleaved fragment; (c) electrophoresing the at least one cleaved fragment to yield an electrophoretic mobility pattern comprising the at least one cleaved fragment; and (d) analyzing the mobility pattern to selectively detect the presence of M. tuberculosis in the sample.
 14. The method of claim 13 wherein the restriction endonuclease is MspI.
 15. The method of claim 13 wherein the sample is a biological fluid.
 16. The method of claim 15 wherein the biological fluid is human sputum.
 17. The method of claim 13 wherein the electrophoresis comprises gel electrophoresis, and wherein the presence of M. tuberculosis in the sample is selectively detected using restriction fragment length polymorphism (RFLP) analysis of said electrophoretic mobility pattern.
 18. A method for selectively detecting M. tuberculosis in a sample containing DNA, said method comprising: (a) amplifying the DNA to generate a detectable amount of amplified DNA comprising a katG DNA fragment comprising base 904 through 1523 of the M. tuberculosis katG gene as depicted in FIG. 7 (SEQ ID NO:20); (b) contacting the katG DNA fragment with a restriction endonuclease that cleaves at C/CGG to yield at least one cleaved fragment; (c) electrophoresing the at least one cleaved fragment to yield a mobility pattern comprising the at least one cleaved fragment; and (d) analyzing the mobility pattern to selectively detect the presence of M. tuberculosis in the sample.
 19. The method of claim 18 wherein the restriction endonuclease is MspI.
 20. The method of claim 18 wherein the sample is a biological fluid.
 21. The method of claim 20 wherein the biological fluid is human sputum.
 22. The method of claim 18 wherein the electrophoresis comprises gel electrophoresis, and wherein the presence of M. tuberculosis in the sample is selectively detected using restriction fragment length polymorphism (RFLP) analysis of said electrophoretic mobility pattern. 
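
For illustration only, the digestion step recited in the claims can be sketched in silico. The following Python sketch cuts a sequence at every C/CGG site and reports the resulting fragment lengths, which is the information an electrophoretic mobility pattern conveys. The function name `digest` and the two 20-base sequences are hypothetical toy inputs, not the sequence of SEQ ID NO:20; they differ by a single G-to-C change chosen so that only the second sequence contains a C/CGG site.

```python
def digest(seq, site="CCGG", cut_offset=1):
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset=1 places the cut after the first base of the site (C/CGG).
    """
    seq = seq.upper()
    cuts = []
    start = 0
    while True:
        i = seq.find(site, start)
        if i == -1:
            break
        cuts.append(i + cut_offset)  # position of the cut within seq
        start = i + 1                # keep scanning past this match
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequences (assumptions, not taken from FIG. 7):
toy_g = "GATCACCAGCGGCATCGAGG"  # G at the variable position, no CCGG site
toy_c = "GATCACCACCGGCATCGAGG"  # C at the variable position, one CCGG site

print(digest(toy_g))  # uncut: one fragment
print(digest(toy_c))  # cut once: two fragments
```

Comparing the two fragment-length lists mirrors the comparison of banding patterns in an RFLP analysis: a site that is present in one variant and absent in the other changes the number and sizes of the fragments.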